Unifying the treatment of preposition-determiner contractions in German universal dependencies treebanks

HDT-UD, the largest German UD treebank by a large margin, and the German-LIT treebank currently do not analyze preposition-determiner contractions such as zum (= zu dem, “to the”) as multi-word tokens. This is inconsistent with both the UD guidelines and the other German UD corpora (GSD and PUD). In this paper, we show that harmonizing the corpora with regard to this highly frequent phenomenon, using a lookup-table-based approach, leads to a considerable increase in automatic parsing performance.
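
The lookup-table idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the table contents, the function name `expand_contractions`, and the simplified `(id, form)` row format are assumptions made for illustration. The sketch only mirrors the CoNLL-U convention of giving a multi-word token a range ID (e.g. `3-4`) followed by one row per syntactic word.

```python
# Illustrative lookup table of common German preposition-determiner
# contractions. Assumption: the table used in the paper may differ.
CONTRACTIONS = {
    "am": ("an", "dem"),
    "ans": ("an", "das"),
    "beim": ("bei", "dem"),
    "im": ("in", "dem"),
    "ins": ("in", "das"),
    "vom": ("von", "dem"),
    "zum": ("zu", "dem"),
    "zur": ("zu", "der"),
}


def expand_contractions(tokens):
    """Return (id, form) rows for a sentence given as a list of surface tokens.

    A contraction yields a range row carrying the surface form, followed by
    one row per syntactic word; every other token yields a single row.
    """
    rows = []
    next_id = 1
    for token in tokens:
        parts = CONTRACTIONS.get(token.lower())
        if parts is None:
            rows.append((str(next_id), token))
            next_id += 1
        else:
            # Range row spanning the two syntactic words of the contraction.
            rows.append((f"{next_id}-{next_id + 1}", token))
            for part in parts:
                rows.append((str(next_id), part))
                next_id += 1
    return rows


if __name__ == "__main__":
    for row_id, form in expand_contractions(["Er", "geht", "zum", "Bahnhof"]):
        print(f"{row_id}\t{form}")
```

The sketch handles tokenization only and splits every table match regardless of context. A real treebank conversion would also have to assign lemmas, part-of-speech tags, heads, and dependency relations to the newly created words, which this example leaves out.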

Metadata
Author:Stefan Grünewald, Annemarie Friedrich
URN:urn:nbn:de:bvb:384-opus4-1056603
Frontdoor URL:https://opus.bibliothek.uni-augsburg.de/opus4/105660
URL:https://aclanthology.org/2020.udw-1.11
ISBN:978-1-952148-48-4
Parent Title (English):Proceedings of the Fourth Workshop on Universal Dependencies (UDW 2020), December 13, 2020, Barcelona, Spain (online)
Publisher:Association for Computational Linguistics
Place of publication:Stroudsburg, PA
Editor:Marie-Catherine de Marneffe, Miryam de Lhoneux, Joakim Nivre, Sebastian Schuster
Type:Conference Proceeding
Language:English
Year of first Publication:2020
Publishing Institution:Universität Augsburg
Release Date:2023/07/10
First Page:94
Last Page:98
Institutes:Fakultät für Angewandte Informatik
Fakultät für Angewandte Informatik / Institut für Informatik
Fakultät für Angewandte Informatik / Institut für Informatik / Professur für Sprachverstehen mit der Anwendung Digital Humanities
Dewey Decimal Classification:0 Computer science, information and general works / 00 Computer science, knowledge and systems / 004 Data processing; computer science
Licence:CC-BY 4.0: Creative Commons: Attribution (with Print on Demand)