A Graph Algorithmic Framework for the Assembly of Shredded Documents
- In this paper, we propose a framework for reassembling shredded documents. Inspired by the way humans approach this problem, we introduce a novel algorithm that iteratively determines groups of fragments that fit together well. We identify such groups by evaluating a set of constraints that take into account shape- and content-based information about each fragment. Accordingly, we choose the best-matching groups of fragments in each iteration and thereby implicitly determine a maximum spanning tree of a graph that represents alignments between the individual fragments. After each iteration, we update the graph with respect to additional contextual knowledge. We evaluate the effectiveness of our approach on a dataset of 16 fragmented pages with strongly varying content. Finally, we show the robustness of the proposed algorithm in situations in which material is lost.
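The abstract names a maximum spanning tree over fragment alignments, and the record is tagged with Kruskal. The following is a minimal sketch of that general idea only, not the authors' implementation: it runs a plain Kruskal pass that takes the highest-scoring alignments first. The fragment names, the scores, and the function `maximum_spanning_tree` are hypothetical, and the report's constraint evaluation and per-iteration graph update are not modeled here.

```python
from typing import Dict, List, Tuple

Edge = Tuple[float, str, str]  # (alignment score, fragment a, fragment b)


def maximum_spanning_tree(fragments: List[str], edges: List[Edge]) -> List[Edge]:
    """Kruskal's algorithm, taking the highest-scoring alignments first."""
    parent: Dict[str, str] = {f: f for f in fragments}

    def find(x: str) -> str:
        # Follow parent links to the group representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen: List[Edge] = []
    for score, a, b in sorted(edges, key=lambda e: e[0], reverse=True):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the edge joins two separate groups of fragments
            parent[root_a] = root_b
            chosen.append((score, a, b))
    return chosen


# Hypothetical alignment scores between four fragments (higher = better fit).
fragments = ["f1", "f2", "f3", "f4"]
edges = [
    (0.9, "f1", "f2"),
    (0.4, "f1", "f3"),
    (0.8, "f2", "f3"),
    (0.7, "f3", "f4"),
    (0.2, "f1", "f4"),
]
tree = maximum_spanning_tree(fragments, edges)
print(tree)
```

In this toy graph the three strongest alignments already connect all four fragments, so the weaker edges are skipped because they would only link fragments that are in the same group.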
Author: | Fabian Richter, Christian X. Ries, Rainer Lienhart |
---|---|
URN: | urn:nbn:de:bvb:384-opus4-12442 |
Frontdoor URL: | https://opus.bibliothek.uni-augsburg.de/opus4/1557 |
Series (Serial Number): | Reports / Technische Berichte der Fakultät für Angewandte Informatik der Universität Augsburg (2011-05) |
Type: | Report |
Language: | English |
Publishing Institution: | Universität Augsburg |
Release Date: | 2011/08/12 |
Tag: | document assembly; spanning tree algorithm; Kruskal |
Institutes: | Fakultät für Angewandte Informatik |
 | Fakultät für Angewandte Informatik / Institut für Informatik |
 | Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Maschinelles Lernen und Maschinelles Sehen |
Dewey Decimal Classification: | 0 Computer science, information and general works / 00 Computer science, knowledge and systems / 004 Data processing; computer science |
Licence (German): | German copyright law (Deutsches Urheberrecht) |