A review of "Bringing Order to Approximate Matching: Classification and Attacks on Similarity Digest Algorithms"

Martín-Pérez, Miguel; Rodríguez, Ricardo J.; Breitinger, Frank

doi:10.18239/jornadas_2021.34.29

Fuzzy hashing or similarity hashing (a.k.a. bytewise approximate matching) converts digital artifacts into an intermediate representation to allow for efficient (fast) identification of similar objects, e.g., for deny-listing. Over the past decade, new algorithms have been developed and released to the digital forensics community. When releasing algorithms (e.g., as part of a scientific article), they are frequently compared with other algorithms to outline the benefits and sometimes also the weaknesses of the proposed approach. However, given the wide variety of algorithms and approaches, it is impossible to provide direct comparisons with all existing algorithms. In this paper, we present the first classification of approximate matching algorithms which allows for an easier description and comparisons. Therefore, we first reviewed existing literature to understand the techniques various algorithms use and to familiarize ourselves with the common terminology.
Our findings allowed us to develop a categorization relying heavily on the terminology proposed by NIST SP 800-168. In addition to the categorization, this paper presents an abstract set of attacks against algorithms and why they are feasible. Lastly, we detail the characteristics needed to build robust algorithms to prevent attacks. We believe that this paper helps newcomers, practitioners, and experts alike to better compare algorithms, understand their potential, as well as characteristics and implications they may have on forensic investigations.

Metadata
Author:	Miguel Martín-Pérez, Ricardo J. Rodríguez, Frank Breitinger
URN:	urn:nbn:de:bvb:384-opus4-1175627
Frontdoor URL:	https://opus.bibliothek.uni-augsburg.de/opus4/117562
ISBN:	9788490444634
ISSN:	2697-049X
Parent Title (Spanish):	Investigación en ciberseguridad: actas de las VI Jornadas Nacionales (JNIC2021 LIVE), online 9-10 de junio de 2021, Universidad de Castilla-La Mancha
Publisher:	Ediciones de la Universidad de Castilla-La Mancha
Place of publication:	Cuenca
Editor:	Manuel A. Serrano, Eduardo Fernández-Medina, Cristina Alcaraz, Noemí de Castro, Guillermo Calvo
Type:	Conference Proceeding
Language:	English
Year of first Publication:	2021
Publishing Institution:	Universität Augsburg
Release Date:	2024/12/16
First Page:	131
Last Page:	132
Series:	Colección Jornadas y Congresos ; 34
DOI:	https://doi.org/10.18239/jornadas_2021.34.29
Institutes:	Fakultät für Angewandte Informatik
	Fakultät für Angewandte Informatik / Institut für Informatik
	Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Cybersicherheit
Dewey Decimal Classification:	0 Computer science, information & general works / 00 Computer science, knowledge & systems / 004 Data processing & computer science
Licence (German):	CC-BY 4.0: Creative Commons: Attribution

Open Access
