Team RobertNLP at the BioCreative VII LitCovid track: neural document classification using SciBERT

  • This paper describes our submission to the BioCreative VII LitCovid track, "Multi-label topic classification for COVID-19 literature annotation". Our system generates embeddings for the title, abstract, and keywords of each document using the transformer-based pre-trained language model SciBERT. The classification layer consists of several multi-layer perceptrons, each predicting whether a single label applies. Our approach, originally developed for hierarchical patent classification, performs strongly on the LitCovid shared task, outperforming roughly 75% of the participating systems.

    Keywords: document representation; multi-task learning; multi-label classification.
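
The classification layer described in the abstract (one multi-layer perceptron per label) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation. Random vectors stand in for the SciBERT embeddings of title, abstract, and keywords. Concatenating the three embeddings is an assumption, since the abstract does not say how they are combined. The label names, layer sizes, and the 0.5 threshold are hypothetical, and the weights are untrained.

```python
import math
import random

EMB_DIM = 8   # toy size for illustration; SciBERT's hidden size is 768
HIDDEN = 4    # hypothetical hidden-layer width
LABELS = ["topic_a", "topic_b", "topic_c"]  # placeholder label names


def make_mlp(rng, in_dim, hidden):
    """Randomly initialise a one-hidden-layer MLP with a single output unit."""
    w1 = [[rng.gauss(0, 0.1) for _ in range(in_dim)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.gauss(0, 0.1) for _ in range(hidden)]
    b2 = 0.0
    return w1, b1, w2, b2


def mlp_probability(params, x):
    """Forward pass: ReLU hidden layer, then a sigmoid over one logit."""
    w1, b1, w2, b2 = params
    hidden = [
        max(0.0, sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))


def classify(heads, title_emb, abstract_emb, keyword_emb, threshold=0.5):
    """Score every label independently with its own MLP head."""
    # Assumption: the three field embeddings are simply concatenated.
    doc = title_emb + abstract_emb + keyword_emb
    probs = {label: mlp_probability(p, doc) for label, p in heads.items()}
    predicted = [label for label, p in probs.items() if p >= threshold]
    return probs, predicted


rng = random.Random(0)
# One independent head per label, all reading the same document vector.
heads = {label: make_mlp(rng, 3 * EMB_DIM, HIDDEN) for label in LABELS}
# Random stand-ins for the SciBERT embeddings of the three text fields.
title, abstract, keywords = (
    [rng.gauss(0, 1) for _ in range(EMB_DIM)] for _ in range(3)
)
probs, predicted = classify(heads, title, abstract, keywords)
```

Because each head outputs its own probability, a document can receive any number of labels, which is what makes the setup multi-label rather than multi-class.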

Author: Subhash Chandra Pujari, Tim Tarsi, Jannik Strötgen, Annemarie Friedrich
Parent Title (English): Proceedings of the BioCreative VII Challenge Evaluation Workshop, November 08-11, 2021
Publisher: University of Delaware
Place of publication: Newark, DE
Editor: Cecilia Arighi
Type: Conference Proceeding
Year of first Publication: 2021
Release Date: 2023/07/10
First Page: 332
Last Page: 335
Institutes: Faculty of Applied Computer Science
Faculty of Applied Computer Science / Institute of Computer Science
Faculty of Applied Computer Science / Institute of Computer Science / Professorship for Language Understanding with Application to Digital Humanities
Dewey Decimal Classification: 0 Computer science, information science, general works / 00 Computer science, knowledge, systems / 004 Data processing; computer science