Centering theory in natural text: a large-scale corpus study

  • We present an extensive corpus study of Centering Theory (CT), examining how adequately CT models coherence in a large body of natural text. A novel analysis of transition bigrams provides strong empirical support for several CT-related linguistic claims that have so far been investigated only on small data sets. The study also reveals genre-based differences in texts’ degrees of entity coherence. Previous work has shown that unsupervised CT-based coherence metrics fail to outperform a simple baseline. We identify two reasons: 1) these metrics assume that some transition types are more coherent than others and that they also occur more frequently, but in our corpus the latter is not the case; and 2) the original sentence order of a document and a random permutation of its sentences differ mostly in the fraction of entity-sharing sentence pairs, which is exactly the factor measured by the baseline.
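The baseline factor named in the abstract, the fraction of entity-sharing sentence pairs, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes each sentence has already been reduced to a set of entity identifiers, and it counts adjacent sentence pairs only. The function name `entity_sharing_fraction` and the toy document are hypothetical.

```python
import random
from typing import List, Set


def entity_sharing_fraction(sentences: List[Set[str]]) -> float:
    """Fraction of adjacent sentence pairs that share at least one entity.

    Assumption: each sentence is given as a set of entity identifiers
    (for example, coreference chain IDs); the source does not specify this.
    """
    pairs = list(zip(sentences, sentences[1:]))
    if not pairs:
        # Fewer than two sentences: no pairs to score.
        return 0.0
    sharing = sum(1 for left, right in pairs if left & right)
    return sharing / len(pairs)


# Hypothetical toy document: every adjacent pair shares an entity.
doc = [
    {"john", "store"},
    {"john", "milk"},
    {"milk", "fridge"},
    {"fridge"},
]

original = entity_sharing_fraction(doc)

# Score one random permutation of the same sentences (seeded for repeatability).
rng = random.Random(0)
shuffled = doc[:]
rng.shuffle(shuffled)
permuted = entity_sharing_fraction(shuffled)

print(f"original order: {original:.2f}")
print(f"random permutation: {permuted:.2f}")
```

Comparing `original` with the scores of several permutations shows how far sentence order alone changes this fraction, which is the quantity the abstract says separates original from permuted orderings.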

Metadata
Author:Annemarie Friedrich, Alexis Palmer
URN:urn:nbn:de:bvb:384-opus4-1056989
Frontdoor URL:https://opus.bibliothek.uni-augsburg.de/opus4/105698
URL:https://nbn-resolving.org/urn:nbn:de:gbv:hil2-opus-2893
ISBN:978-3-934105-46-1
Parent Title (English):Proceedings of the 12th edition of the KONVENS conference, October 8–10, 2014, Hildesheim, Germany
Publisher:Universitätsverlag Hildesheim
Place of publication:Hildesheim
Editor:Josef Ruppenhofer, Gertrud Faaß
Type:Conference Proceeding
Language:English
Year of first Publication:2014
Publishing Institution:Universität Augsburg
Release Date:2023/07/10
First Page:137
Last Page:144
Institutes:Fakultät für Angewandte Informatik
Fakultät für Angewandte Informatik / Institut für Informatik
Fakultät für Angewandte Informatik / Institut für Informatik / Professur für Sprachverstehen mit der Anwendung Digital Humanities
Dewey Decimal Classification:0 Computer science, information and general works / 00 Computer science, knowledge and systems / 004 Data processing; computer science
Licence (German):CC-BY 3.0: Creative Commons - Attribution (with Print on Demand)