Fine-tuning large language models for digital forensics: case study and general recommendations

Large language models (LLMs) have rapidly gained popularity in various fields, including digital forensics (DF), where they offer the potential to accelerate investigative processes. Although several studies have explored LLMs for tasks such as evidence identification, artifact analysis, and report writing, fine-tuning models for specific forensic applications remains underexplored. This paper addresses that gap by proposing recommendations for fine-tuning LLMs for digital forensics tasks. To demonstrate the applicability of the recommendations, we present a case study on chat summarization in which we evaluate multiple fine-tuned models. The paper concludes with the lessons learned from the case study.

Metadata
Author:	Gaëtan Michelet, Hans Henseler, Harm van Beek, Mark Scanlon, Frank Breitinger
Frontdoor URL:	https://opus.bibliothek.uni-augsburg.de/opus4/124192
ISSN:	2576-5337
Parent Title (English):	Digital Threats: Research and Practice
Publisher:	Association for Computing Machinery (ACM)
Type:	Article
Language:	English
Year of first Publication:	2025
Publishing Institution:	Universität Augsburg
Release Date:	2025/08/01
DOI:	https://doi.org/10.1145/3748264
Institutes:	Fakultät für Angewandte Informatik
	Fakultät für Angewandte Informatik / Institut für Informatik
	Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Cybersicherheit
Dewey Decimal Classification:	0 Computer science, information & general works / 00 Computer science, knowledge & systems / 004 Data processing & computer science
Collection:	Latest Publications (not yet published in print)

Open Access