Word segmentation for Akkadian cuneiform

  • We present experiments on word segmentation for Akkadian cuneiform, an ancient writing system and a language used for about 3 millennia in the ancient Near East. To our best knowledge, this is the first study of this kind applied to either the Akkadian language or the cuneiform writing system. As a logosyllabic writing system, cuneiform structurally resembles Eastern Asian writing systems, so, we employ word segmentation algorithms originally developed for Chinese and Japanese. We describe results of rule-based algorithms, dictionary-based algorithms, statistical and machine learning approaches. Our results may indicate possible promising steps in cuneiform word segmentation that can create and improve natural language processing in this area.

Download full text files

Export metadata

Statistics

Number of document requests

Additional Services

Share in Twitter Search Google Scholar
Metadaten
Author:Timo Homburg, Christian ChiarcosORCiDGND
URN:urn:nbn:de:bvb:384-opus4-1041261
Frontdoor URLhttps://opus.bibliothek.uni-augsburg.de/opus4/104126
URL:https://aclanthology.org/L16-1642
URL:http://www.lrec-conf.org/proceedings/lrec2016/index.html
ISBN:978-2-9517408-9-1OPAC
Parent Title (English):Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), May 23-28, 2016, Portorož, Slovenia
Publisher:European Language Resources Association
Place of publication:Paris
Editor:Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Type:Conference Proceeding
Language:English
Year of first Publication:2016
Publishing Institution:Universität Augsburg
Release Date:2023/05/16
First Page:4067
Last Page:4074
Institutes:Philologisch-Historische Fakultät
Philologisch-Historische Fakultät / Angewandte Computerlinguistik
Philologisch-Historische Fakultät / Angewandte Computerlinguistik / Lehrstuhl für Angewandte Computerlinguistik (ACoLi)
Dewey Decimal Classification:4 Sprache / 40 Sprache / 400 Sprache
Licence (German):CC-BY-NC 4.0: Creative Commons: Namensnennung - Nicht kommerziell (mit Print on Demand)