A Graph Algorithmic Framework for the Assembly of Shredded Documents

In this paper we propose a framework to address the reassembly of shredded documents. Inspired by the way humans approach this problem, we introduce a novel algorithm that iteratively determines groups of fragments that fit together well. We identify such groups by evaluating a set of constraints that takes into account shape- and content-based information about each fragment. In each iteration we choose the best-matching groups of fragments and thereby implicitly determine a maximum spanning tree of a graph that represents alignments between the individual fragments. After each iteration we update the graph with respect to additional contextual knowledge. We evaluate the effectiveness of our approach on a dataset of 16 fragmented pages with strongly varying content. Finally, we show the robustness of the proposed algorithm in situations in which material is lost.
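
To illustrate the spanning-tree idea mentioned in the abstract, the following is a minimal sketch of a plain Kruskal-style maximum spanning tree over a fragment alignment graph. It is not the authors' algorithm: the report describes an iterative procedure that evaluates shape- and content-based constraints and updates the graph after each iteration, none of which is modelled here. The function name `maximum_spanning_tree`, the `(score, a, b)` edge layout, and the example scores are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical edge layout: (alignment score, fragment a, fragment b).
# A higher score is assumed to mean a better fit between two fragments.
Edge = Tuple[float, int, int]


def maximum_spanning_tree(num_fragments: int, edges: List[Edge]) -> List[Edge]:
    """Return the edges of a maximum spanning forest using Kruskal's method."""
    parent = list(range(num_fragments))  # union-find forest, one node per fragment

    def find(x: int) -> int:
        # Follow parent pointers to the group representative, halving paths as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree: List[Edge] = []
    # Visit the best-scoring alignments first.
    for score, a, b in sorted(edges, reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the two fragments are not yet in the same group
            parent[root_a] = root_b  # merge the two groups
            tree.append((score, a, b))
    return tree


# Made-up alignment scores between four fragments, for illustration only.
edges = [(0.9, 0, 1), (0.4, 1, 2), (0.8, 0, 2), (0.7, 2, 3), (0.2, 1, 3)]
tree = maximum_spanning_tree(4, edges)
print(tree)
```

With these example scores, the three strongest alignments that do not close a cycle are kept and the two weakest are discarded. A static sort like this only captures the "pick the best match first" aspect; a procedure that re-scores edges after every merge, as the abstract describes, would need to recompute or reorder candidates between iterations.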

Metadata
Author: Fabian Richter, Christian X. Ries, Rainer Lienhart
URN: urn:nbn:de:bvb:384-opus4-12442
Frontdoor URL: https://opus.bibliothek.uni-augsburg.de/opus4/1557
Series (Serial Number): Reports / Technische Berichte der Fakultät für Angewandte Informatik der Universität Augsburg (2011-05)
Type: Report
Language: English
Publishing Institution: Universität Augsburg
Release Date: 2011/08/12
Tag: document assembly; spanning tree algorithm; Kruskal
Institutes: Fakultät für Angewandte Informatik
Fakultät für Angewandte Informatik / Institut für Informatik
Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Maschinelles Lernen und Maschinelles Sehen
Dewey Decimal Classification: 0 Computer science, information and general works / 00 Computer science, knowledge and systems / 004 Data processing; computer science
Licence (German): German copyright law (Deutsches Urheberrecht)