States, events, and generics: computational modeling of situation entity types

This dissertation addresses the computational modeling of situation entity types (Smith, 2003), an inventory of clause types capturing aspectual and semantic distinctions that are relevant for various natural language processing tasks including temporal discourse processing and information extraction. The focus of our work is on automatically identifying the situation entity types STATE ("John is tall"), EVENT ("John won the race"), GENERALIZING SENTENCE ("John cycles to work") and GENERIC SENTENCE ("Elephants are mammals"). We create a large corpus of texts from a variety of genres and domains, annotating each clause with its situation entity type and with linguistic phenomena that we identify as relevant for distinguishing the types.
Specifically, we mark each clause with its lexical aspectual class, which takes the values stative ("be," "know") or dynamic ("run," "win"), and whether the clause is episodic or habitual, i.e., whether it refers to a particular event or whether it generalizes over situations. In addition, we annotate whether a clause's subject is generic or not, i.e., whether it refers to a kind ("dogs") or to a particular individual ("my dog"). Our human annotators achieve substantial agreement for all of these annotation tasks. Based on this corpus, we conduct a detailed corpus-linguistic study of situation entity type distributions and of how inter-annotator agreement varies with genre. In the second part of this dissertation, we create computational models for each of the above-mentioned classification tasks in a supervised setting, advancing the state of the art in each case. We find a range of syntactic-semantic features, including distributional information and corpus-based linguistic indicators, to be helpful. Using a sequence labeling method, we are able to leverage discourse information in order to improve the recognition of genericity, which often cannot be decided without taking the surrounding sentences into account. We show our models to perform robustly across domains. Our publicly available data set and implementation form the basis for future research on situation entity types and related aspectual phenomena, for example as a preprocessing step for various natural language processing tasks.

Metadata
Author: Annemarie Friedrich
Frontdoor URL: https://opus.bibliothek.uni-augsburg.de/opus4/105681
Title Additional (German): Zustände, Ereignisse und Generizität: computergestützte Modellierung von Situationstypen
Publisher: Universität des Saarlandes
Place of publication: Saarbrücken
Type: Book
Language: English
Year of first Publication: 2017
Release Date: 2023/07/10
Page number: 217
Note: Also submitted as a dissertation, Universität des Saarlandes, 2017
DOI: https://doi.org/10.22028/D291-23666
Institutes: Fakultät für Angewandte Informatik
Fakultät für Angewandte Informatik / Institut für Informatik
Fakultät für Angewandte Informatik / Institut für Informatik / Professur für Sprachverstehen mit der Anwendung Digital Humanities