Workshops & Tutorials

We are pleased to announce that two workshops and two tutorials will take place at DocEng 2016 Vienna, on Tuesday, 13 September, 2016. Find details about the schedule in the program folder.

The workshops are:

  • DChanges 2016 – Document Changes: Modeling, Detection, Storage and Visualization

Organizers: Gioele Barabucci, Uwe M. Borghoff, Angelo Di Iorio, Sonja Schimmler, Ethan Munson

Workshop website

The focus of the workshop is the application of diff algorithms to documents that are collaboratively edited via the web. For many people, especially non-technical people, creating documents via web interfaces is the most common and practical way to collaborate with others. Probably the best-known examples of web-based collaborative editors are Google Docs and Wikipedia pages, but many other platforms are in use as well. Studying and analyzing these environments, however, requires a new body of knowledge. Our current definition of “document”, usually strongly connected to that of “file”, has to be reviewed. The technical way in which the analysis of documents is carried out must also change, as these documents are usually accessed piecewise via remote APIs, and not in one big chunk as locally stored binary objects. Research on these topics is emerging and we would like to stimulate and promote it even further.

Other topics of interest are: diffing and change-tracking algorithms (finding changes on trees, graphs, diagrams and any kind of document, and at different levels of abstraction), merging (management of update conflicts, n-way merge algorithms), and applications of diff techniques from and to other domains (software engineering, ontology management, humanities, law, medicine). As in past years, the workshop aims to bring together researchers and practitioners from industry and academia to discuss these issues in an informal setting and to foster collaboration among them.

–  Abstracts are due: June 19th, 2016

–  Papers and notes are due: July 10th, 2016

–  Camera ready: August 28th, 2016

 

  • Workshop on Future Publishing Formats

Organizers: Michael Piotrowski

The familiar PDF-based scholarly publishing workflow, which emulates even earlier paper-based workflows, has been surprisingly resistant to change. However, it is becoming increasingly clear that it no longer meets the requirements of a quickly evolving scholarly, technical, and political environment, which includes the trend towards open access publishing, reproducible research, mobile devices, linked open data, and many other topics.

This workshop approaches scholarly publishing from a document engineering perspective and focuses on the question of document formats for submission, review, publication, and archival of scholarly publications. We will discuss the current state of scholarly publishing from a document engineering point of view, with the explicit goal of identifying potential alternatives to the current workflow.

Topics to be discussed include:

    • Issues with the current PDF-based workflow
    • Areas that need immediate improvement and potential longer-term improvements
    • Alternative publishing formats and markup languages
    • Semantic publishing approaches, such as nanopublications, linked open data, etc.
    • New views on single-source publishing
    • Evaluation criteria from an author’s perspective
    • Evaluation criteria from the perspective of document engineering research and experience
    • Implications for authoring, submission, review, and archival
    • Benefits and downsides of publishing formats

This is a workshop in the original sense of the word, i.e., a working session, not a “mini conference.” It is scheduled for a half-day slot, opened by an introductory presentation and followed by small group discussions on the questions listed above and a plenary session for building consensus.

Attendees are invited to submit suggestions to the organizer in advance. Such suggestions might include information on experiences with alternative publishing formats and workflows, related work, links to relevant software, and so forth.

The tutorials are:

  • Table Modelling, Extraction and Processing

Organizers: Max Göbel, Tamir Hassan, Ermelinda Oro, Roya Rastan

Tables are of particular interest to document engineers, as they contain information in a human-readable, structured form that can be made machine-readable relatively easily. Both the common problem of extracting information from tables and the less common problem of automatically creating tables from structured data have been addressed by numerous researchers in the past. Many of these works have proposed novel approaches for modelling such transformations and separating the data from its presentation.

The tutorial will provide a general overview of these methods and is geared toward the general document engineering community; no prior experience in this area is required. The tutorial will cover the following topics:

    • Definition of table in general terms
    • Modelling of data in tables
    • Table Extraction and Table Understanding
    • Survey of table extraction systems, techniques, and applications
    • Overview of the ICDAR 2013 Table Competition
    • Brief overview of techniques used for performance evaluation of table processing systems
    • Automatic table synthesis from a data source
    • Future work and open research problems

  • Document Engineering Issues in Malware Analysis

Organizers: Robert Brandon, Charles Nicholas

The focus of the tutorial will be an overview of the field of malware analysis with emphasis on issues related to document engineering. We introduce the field with a discussion of the types of malware, including executable binaries, malicious PDFs, and exploit kits. The most popular tools used for analyzing malicious binaries will be presented and demonstrated, including IDA and OllyDbg. Concepts and tools from static and binary analysis will be discussed. Some collections of malware specimens are available to researchers, and these will be used as examples as appropriate. We will discuss cluster analysis, malware attribution, and the problems caused by polymorphic malware. We will conclude with our view of important research questions in the field.

Contact

For any questions about workshops & tutorials, please don’t hesitate to contact Sonja Schimmler.