Team RobertNLP at the BioCreative VII LitCovid track: neural document classification using SciBERT

This paper describes our submission to the BioCreative VII LitCovid track, "Multi-label topic classification for COVID-19 literature annotation." Our system generates embeddings for the title, abstract, and keywords of each document using the transformer-based pre-trained language model SciBERT. The classification layer consists of several multi-layer perceptrons, each predicting whether a single label applies. Our approach, originally developed for hierarchical patent classification, performs strongly on the LitCovid shared task, outperforming roughly 75% of the participating systems.

Keywords: document representation; multi-task learning; multi-label classification.
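
The architecture outlined in the abstract (one embedding per text field, then one small multi-layer perceptron per label) can be illustrated with a minimal sketch. This is not the authors' implementation. The function `embed_field` is a hypothetical stand-in that returns a deterministic pseudo-random vector instead of calling SciBERT. The label names, the dimensions, the concatenation of the three field embeddings, and the 0.5 decision threshold are all assumptions made for illustration. The weights are untrained, so the probabilities carry no meaning.

```python
import math
import random

# Illustrative label subset and toy sizes; none of these values come from the paper.
LABELS = ["Treatment", "Diagnosis", "Prevention"]
EMB_DIM = 8      # toy size; a real SciBERT embedding is much larger
HIDDEN_DIM = 4   # hypothetical hidden-layer width


def embed_field(text, dim=EMB_DIM):
    """Hypothetical stand-in for a SciBERT field embedding.

    Returns a deterministic pseudo-random vector seeded by the text.
    """
    rng = random.Random(text)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


class LabelHead:
    """One MLP with a single hidden layer that scores exactly one label."""

    def __init__(self, in_dim, hidden_dim, rng):
        self.w1 = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)]
                   for _ in range(hidden_dim)]
        self.b1 = [0.0] * hidden_dim
        self.w2 = [rng.gauss(0.0, 0.1) for _ in range(hidden_dim)]
        self.b2 = 0.0

    def predict_proba(self, x):
        # Hidden layer with ReLU activation.
        hidden = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        # Single output unit with a sigmoid: probability that the label applies.
        logit = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return 1.0 / (1.0 + math.exp(-logit))


def classify(title, abstract, keywords, heads, threshold=0.5):
    # Assumption: the three field embeddings are concatenated into one vector.
    doc_vec = embed_field(title) + embed_field(abstract) + embed_field(keywords)
    # Each head decides independently, so a document can receive several labels.
    probs = {label: head.predict_proba(doc_vec) for label, head in heads.items()}
    predicted = [label for label, p in probs.items() if p >= threshold]
    return probs, predicted


rng = random.Random(0)
heads = {label: LabelHead(3 * EMB_DIM, HIDDEN_DIM, rng) for label in LABELS}
probs, predicted = classify("A title", "An abstract.", "covid-19; vaccine", heads)
```

Because every label has its own head and its own sigmoid output, the predictions are not forced to sum to one, which is what makes the setup multi-label rather than multi-class. In a real system the stub would be replaced by an actual encoder and the heads would be trained jointly on annotated documents.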

Metadata
Author:Subhash Chandra Pujari, Tim Tarsi, Jannik Strötgen, Annemarie Friedrich
Frontdoor URL:https://opus.bibliothek.uni-augsburg.de/opus4/105623
URL:https://biocreative.bioinformatics.udel.edu/resources/publications/bc-vii-workshop-proceedings/
ISBN:978-0-578-32368-8
Parent Title (English):Proceedings of the BioCreative VII Challenge Evaluation Workshop, November 08-11, 2021
Publisher:University of Delaware
Place of publication:Newark, DE
Editor:Cecilia Arighi
Type:Conference Proceeding
Language:English
Year of first Publication:2021
Release Date:2023/07/10
First Page:332
Last Page:335
Institutes:Faculty of Applied Computer Science
Faculty of Applied Computer Science / Institute of Computer Science
Faculty of Applied Computer Science / Institute of Computer Science / Professorship for Language Understanding with Application to Digital Humanities
Dewey Decimal Classification:0 Computer science, information, general works / 00 Computer science, knowledge, systems / 004 Data processing; computer science