Towards LLOD-based language contact studies: a case study in interoperability

Chiarcos, Christian; Donandt, Kathrin; Sargsian, Hasmik; Ionov, M.; Wichers Schreur, Jesse

We describe a methodological and technical framework for conducting qualitative and quantitative studies of linguistic research questions over diverse and heterogeneous data sources such as corpora and elicitations. We demonstrate how LLOD formalisms can be employed to develop extraction pipelines for features and linguistic examples from corpora and collections of interlinear glossed text, and furthermore, how SPARQL UPDATE can be employed (1) to normalize diverse data against a reference data model (here, POWLA), (2) to harmonize annotation vocabularies by reference to terminology repositories (here, OLiA), (3) to extract examples from these normalized data structures regardless of their origin, and (4) to implement this extraction routine in a tool-independent manner for different languages with different annotation schemes. We demonstrate our approach for language contact studies for genetically unrelated but neighboring languages from the Caucasus area, Eastern Armenian and Georgian.
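The harmonize-then-extract idea in steps (2) and (3) of the abstract can be illustrated with a minimal sketch. It is a plain-Python stand-in, not the SPARQL UPDATE pipeline the paper describes: in an actual LLOD setting the tokens would be RDF data and the mapping would be applied by update queries. The corpus names, tagsets, concept labels, and tokens below are all invented for illustration.

```python
# Hypothetical tag-to-concept mappings for two corpora with different tagsets.
# The concept labels only imitate reference-vocabulary names; they are
# illustrative and not taken from the paper or from OLiA itself.
TAG_MAPPINGS = {
    "corpus_a": {"N": "Noun", "V": "Verb"},
    "corpus_b": {"noun": "Noun", "verb": "Verb"},
}


def harmonize(tokens, source):
    """Return copies of the tokens with 'source' and a shared 'concept' added."""
    mapping = TAG_MAPPINGS[source]
    return [
        {**tok, "source": source, "concept": mapping.get(tok["tag"])}
        for tok in tokens
    ]


def extract(tokens, concept):
    """Select word forms by harmonized concept, regardless of their origin."""
    return [tok["form"] for tok in tokens if tok["concept"] == concept]


# Invented toy tokens; the forms are placeholders, not real language data.
corpus_a = [{"form": "a_word_1", "tag": "N"}, {"form": "a_word_2", "tag": "V"}]
corpus_b = [{"form": "b_word_1", "tag": "noun"}, {"form": "b_word_2", "tag": "verb"}]

merged = harmonize(corpus_a, "corpus_a") + harmonize(corpus_b, "corpus_b")
nouns = extract(merged, "Noun")
```

Once both tagsets point to the same concept labels, a single extraction routine serves both corpora, which is the property the abstract refers to as tool-independent extraction across annotation schemes.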

Metadata
Author:	Christian Chiarcos, Kathrin Donandt, Hasmik Sargsian, M. Ionov, Jesse Wichers Schreur
URN:	urn:nbn:de:bvb:384-opus4-1040947
Frontdoor URL:	https://opus.bibliothek.uni-augsburg.de/opus4/104094
URL:	http://lrec-conf.org/workshops/lrec2018/W23/index.html
ISBN:	979-10-95546-19-1
Parent Title (English):	Proceedings of the 6th Workshop on Linked Data in Linguistics: Towards Linguistic Data Science, co-located with LREC2018, 12 May 2018, Miyazaki, Japan
Publisher:	European Language Resources Association
Place of publication:	Paris
Editor:	John P. McCrae, Christian Chiarcos, Thierry Declerck, Jorge Gracia, Bettina Klimek
Type:	Conference Proceeding
Language:	English
Year of first Publication:	2018
Publishing Institution:	Universität Augsburg
Release Date:	2023/05/16
Institutes:	Philologisch-Historische Fakultät
	Philologisch-Historische Fakultät / Angewandte Computerlinguistik
	Philologisch-Historische Fakultät / Angewandte Computerlinguistik / Lehrstuhl für Angewandte Computerlinguistik (ACoLi)
Dewey Decimal Classification:	4 Language / 40 Language / 400 Language
Licence (German):	CC-BY-NC 4.0: Creative Commons: Attribution - NonCommercial (with Print on Demand)

Open Access
