Misspellings in responses to listening comprehension questions: prospects for scoring based on phonetic normalization

Automated scoring systems which evaluate content require robust ways of dealing with form errors. The work presented in this paper is set in the context of scoring learners’ responses to listening comprehension items included in a placement test of German as a foreign language. Based on a corpus of over 3000 responses to 17 questions, by test takers of different language proficiencies, we perform a quantitative analysis of the diversity in misspellings. We evaluate the performance of an off-the-shelf open-source spell-checker on our data, showing that around 45% of the reported non-word errors are not correctly accounted for, that is, they are either falsely identified as misspelt or the spell-checker is unable to identify the intended word. We propose to address misspellings in computer-based scoring of constructed response items by means of phonetic normalization. Learner responses transcribed into Soundex codes and into two encodings borrowed from historical linguistics (ASJP and Dolgopolsky’s sound classes) are compared to transcribed reference answers using string distance measures. We show that reliable correlation with teachers’ scores can be obtained; however, similarity thresholds are item-specific.
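The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the classic American Soundex rules (the abstract does not say which Soundex variant was used), leaves out the ASJP and Dolgopolsky encodings, and uses a length-normalised Levenshtein distance as one possible string distance measure. The function names and the example words are hypothetical.

```python
import unicodedata

# Classic American Soundex digit classes (assumption: the paper's variant may differ).
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character Soundex code of a word ('' if it has no letters)."""
    # Upper-case, strip diacritics (ä -> A, ß -> SS) and keep only A-Z.
    decomposed = unicodedata.normalize("NFKD", word.upper())
    letters = [c for c in decomposed if "A" <= c <= "Z"]
    if not letters:
        return ""
    code = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        digit = CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        prev = digit  # a vowel resets prev, so a repeated code counts again
    return (code + "000")[:4]


def levenshtein(a, b):
    """Edit distance with unit costs for insertion, deletion and substitution."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def phonetic_similarity(response, reference):
    """Similarity in [0, 1] between the Soundex transcriptions of two answers."""
    enc_resp = " ".join(soundex(tok) for tok in response.split())
    enc_ref = " ".join(soundex(tok) for tok in reference.split())
    longest = max(len(enc_resp), len(enc_ref))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(enc_resp, enc_ref) / longest


if __name__ == "__main__":
    # Hypothetical misspelling of "Fahrrad" (bicycle): both map to the same code.
    print(soundex("Fahrrad"), soundex("Farad"))
    print(phonetic_similarity("das Farad", "das Fahrrad"))
```

A response would then be accepted when its similarity to a reference answer reaches a cut-off; since the abstract reports that similarity thresholds are item-specific, such a cut-off would have to be tuned per question rather than fixed globally.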

Metadata
Author:Heike da Silva Cardoso, Magdalena Wolska
URN:urn:nbn:de:bvb:384-opus4-1134421
Frontdoor URL:https://opus.bibliothek.uni-augsburg.de/opus4/113442
URL:https://ep.liu.se/en/conference-article.aspx?series=ecp&issue=114&Article_No=2
ISBN:978-91-7519-036-5
ISSN:1650-3686
Parent Title (English):Proceedings of the 4th workshop on NLP for Computer Assisted Language Learning at NODALIDA 2015, 11th May 2015, Vilnius, Lithuania
Publisher:Linköping University Electronic Press
Place of publication:Linköping
Editor:Elena Volodina, Lars Borin, Ildikó Pilán
Type:Conference Proceeding
Language:English
Year of first Publication:2015
Publishing Institution:Universität Augsburg
Release Date:2024/06/13
First Page:1
Last Page:10
Series:Linköping Electronic Conference Proceedings ; 114:2
Series:NEALT Proceedings Series ; 26:2
Institutes:Universität Serviceeinrichtungen
Universität Serviceeinrichtungen / Universitätsbibliothek
Dewey Decimal Classification:4 Language / 40 Language / 400 Language
Licence (German):CC-BY 2.0: Creative Commons - Attribution (with Print on Demand)