MuLMS-AZ: an argumentative zoning dataset for the materials science domain

Scientific publications follow conventionalized rhetorical structures. Classifying the Argumentative Zone (AZ), e.g., identifying whether a sentence states a Motivation, a Result or Background information, has been proposed to improve processing of scholarly documents. In this work, we adapt and extend this idea to the domain of materials science research. We present and release a new dataset of 50 manually annotated research articles. The dataset spans seven sub-topics and is annotated with a materials-science-focused multi-label annotation scheme for AZ. We detail corpus statistics and demonstrate high inter-annotator agreement. Our computational experiments show that using domain-specific pre-trained transformer-based text encoders is key to high classification performance. We also find that AZ categories from existing datasets in other domains are transferable to varying degrees.

Metadata
Author:Timo Schrader, Teresa Bürkle, Sophie Henning, Sherry Tan, Matteo Finco, Stefan Grünewald, Maira Indrikova, Felix Hildebrand, Annemarie Friedrich
URN:urn:nbn:de:bvb:384-opus4-1264899
Frontdoor URL:https://opus.bibliothek.uni-augsburg.de/opus4/126489
ISBN:978-1-959429-89-0
Parent Title (English):Proceedings of the 4th Workshop on Computational Approaches to Discourse (CODI 2023), 13-14 July 2023, Toronto, Canada
Publisher:Association for Computational Linguistics (ACL)
Place of publication:Stroudsburg, PA
Editor:Michael Strube, Chloe Braud, Christian Hardmeier, Junyi Jessy Li, Sharid Loaiciga, Amir Zeldes
Type:Conference Proceeding
Language:English
Year of first Publication:2023
Publishing Institution:Universität Augsburg
Release Date:2025/11/26
First Page:1
Last Page:15
DOI:https://doi.org/10.18653/v1/2023.codi-1.1
Institutes:Fakultät für Angewandte Informatik
Fakultät für Angewandte Informatik / Institut für Informatik
Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Computerlinguistik
Dewey Decimal Classification:0 Computer science, information and general works / 00 Computer science, knowledge and systems / 004 Data processing; computer science
Licence:CC-BY 4.0: Creative Commons: Attribution