Schedule

Wednesday, September 14th

08:45-09:30 Registration
09:30-09:45 Session 1: Welcome and Introduction
09:45-10:45 Session 2: Keynote I
09:45

Research Infrastructures, or How Document Engineering, Cultural Heritage, and Digital Humanities Can Go Together ( abstract )
10:45-11:15 Coffee Break
11:15-12:30 Session 3: Layouts and Publishing
11:15

A general framework for globally optimized pagination ( abstract )
11:45

Aesthetic Measures for Document Layouts: Operationalization and Analysis in the Context of Marketing Brochures ( abstract )
12:15

METIS: A Multi-faceted Hybrid Book Learning Platform ( abstract )
12:30-12:45 Session 4: BoF: How It Works
12:45-14:15 Lunch Break (incl. BoF)
14:15-15:30 Session 5: XML & Data Modelling
14:15

Digital Preservation Based on Contextualized Dependencies ( abstract )
14:45

Schema-aware extended Annotation Graphs ( abstract )
15:15

NCM 3.1: A Conceptual Model for Hyperknowledge Document Engineering ( abstract )
15:30-16:00 Coffee Break
16:00-17:00 Session 6: ProDoc: Doctoral Consortium
16:00

A Language Theoretical Framework For The Integration Of Arts And Humanities Research Data ( abstract )
16:30

Towards supporting multimodal and multiuser interactions in multimedia languages ( abstract )
17:00-18:15 Session 7: Text Analysis I: Similarity
17:00

Using a Dictionary and n-gram Alignment to Improve Fine-grained Cross-Language Plagiarism Detection ( abstract )
17:30

Relaxing Orthogonality Assumption in Conceptual Text Document Similarity ( abstract )
18:00

Enhancing the Searchability of Page-Image PDF Documents Using an Aligned Hidden Layer from a Truth Text ( abstract )
19:00-23:00 Welcome Reception, TU the Sky

Thursday, September 15th

09:30-10:30 Session 8: Keynote II
09:30

Design Is Not What You Think It Is ( abstract )
10:30-11:00 Coffee Break
11:00-12:15 Session 9: Text Analysis II: Classification
11:00

SEL: a Unified Algorithm for Entity Linking and Saliency Detection ( abstract )
11:30

Automated Intrinsic Text Classification for Component Content Management Applications in Technical Communication ( abstract )
11:45

Centroid Terms as Text Representatives ( abstract )
12:00

Frequent Multi-Byte Character Substring Extraction using a Succinct Data Structure ( abstract )
12:15-12:45 Session 10: BoF: The Results
12:45-14:15 Lunch Break
14:15-14:45 Session 11: Workshop Session Recap
14:45-15:05 Session 12: Poster Lightning Talks
14:45

Multilingual News Article Summarization in Mobile Devices – Demo ( abstract )
14:47

Rendering Mathematic Formulas for the Web in Madoko ( abstract )
14:49

A PDF Wrapper for Table Processing ( abstract )
14:51

Configurable Table Structure Recognition in Untagged PDF documents ( abstract )
14:53

Extending data models by declaratively specifying contextual knowledge ( abstract )
14:55

Using Convolutional Neural Networks for Content Extraction from Online Flyers ( abstract )
14:57

Combining Taxonomies using Word2Vec ( abstract )
14:59

Important Word Organization for Support of Browsing Scholarly Papers Using Author Keywords ( abstract )
15:01

Selecting Features with Class Based and Importance Weighted Document Frequency in Text Classification ( abstract )
15:03

Bayesian mixture models on connected components for Newspaper article segmentation ( abstract )
15:05-16:00 Coffee & Poster Session
16:00-16:45 Session 13: Text Analysis III: Summarization
16:00

Applying Link Target Identification and Content Extraction to improve Web News Summarization ( abstract )
16:15

Towards Cohesive Extractive Summarization through Anaphoric Expression Resolution ( abstract )
16:30

Assessing Concept Weighting in Integer Linear Programming based Single-document Summarization ( abstract )
16:45-17:30 Session 14: SIGWEB Presentation
19:30-23:59 Conference Banquet, Rathaus (City Hall)

Friday, September 16th

09:30-10:45 Session 15: Applications & Security
09:30

A Lightweight and Efficient Mechanism for Fixing the Synchronization of Misaligned Subtitle Documents ( abstract )
10:00

DocuGram: Turning Screen Recordings into Documents ( abstract )
10:15

An Exploratory Study on Managing and Searching for Documents in Software Engineering Environments ( abstract )
10:30

Mass Serialization Method for Document Encryption Policy Enforcement ( abstract )
10:45-11:15 Coffee Break
11:15-12:45 Session 16: Visual Document Analysis
11:15

Generation of Search-able PDF of the Chemical Equations segmented from Document Images ( abstract )
11:45

A Multimodal Crowdsourcing Framework for Transcribing Historical Handwritten Documents ( abstract )
12:15

Embedded Textual Content for Document Image Classification with Convolutional Neural Networks ( abstract )
12:45-13:00 Session 17: Closing Notes
13:00-14:30 Lunch Break