Team RobertNLP at the BioCreative VII LitCovid track: neural document classification using SciBERT
- This paper describes our submission to the BioCreative VII LitCovid track, "Multi-label topic classification for COVID-19 literature annotation". Our system generates embeddings for the title, abstract, and keywords of each document using SciBERT, a transformer-based pre-trained language model. The classification layer consists of several multi-layer perceptrons, each predicting whether a single label applies. Our approach, originally developed for hierarchical patent classification, performs strongly on the LitCovid shared task, outperforming roughly 75% of the participating systems. Keywords: document representation; multi-task learning; multi-label classification.
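The per-label classification layer described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation. The SciBERT embeddings are replaced by random toy vectors. Combining title, abstract, and keyword embeddings by concatenation is an assumption. The hidden size, the threshold, and the helper names `make_mlp`, `head_probability`, and `classify` are hypothetical. The label names are the seven LitCovid topic labels, and the weights are untrained.

```python
import math
import random

# The seven topic labels used in the LitCovid track.
LABELS = ["Treatment", "Diagnosis", "Prevention", "Mechanism",
          "Transmission", "Epidemic Forecasting", "Case Report"]


def make_mlp(in_dim, hidden_dim, rng):
    """Create one untrained head: in_dim -> hidden_dim (ReLU) -> 1 logit."""
    return {
        "w1": [[rng.gauss(0, 0.1) for _ in range(in_dim)]
               for _ in range(hidden_dim)],
        "b1": [0.0] * hidden_dim,
        "w2": [rng.gauss(0, 0.1) for _ in range(hidden_dim)],
        "b2": 0.0,
    }


def head_probability(head, x):
    """Return the sigmoid probability that this head's label applies."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(head["w1"], head["b1"])]
    logit = sum(w * h for w, h in zip(head["w2"], hidden)) + head["b2"]
    return 1.0 / (1.0 + math.exp(-logit))


def classify(title_vec, abstract_vec, keyword_vec, heads, threshold=0.5):
    """Score every label independently with its own MLP head."""
    # Assumption: the three field embeddings are simply concatenated.
    doc_vec = title_vec + abstract_vec + keyword_vec
    probs = {label: head_probability(head, doc_vec)
             for label, head in heads.items()}
    predicted = [label for label, p in probs.items() if p >= threshold]
    return probs, predicted


rng = random.Random(0)
EMB_DIM = 8  # toy size; SciBERT's hidden size is 768
heads = {label: make_mlp(3 * EMB_DIM, 16, rng) for label in LABELS}

# Stand-ins for the SciBERT embeddings of title, abstract, and keywords.
title_vec = [rng.uniform(-1, 1) for _ in range(EMB_DIM)]
abstract_vec = [rng.uniform(-1, 1) for _ in range(EMB_DIM)]
keyword_vec = [rng.uniform(-1, 1) for _ in range(EMB_DIM)]

probs, predicted = classify(title_vec, abstract_vec, keyword_vec, heads)
```

Because each head makes its own binary decision, a document can receive any number of labels, which is what makes the setup multi-label rather than multi-class.
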
Author: | Subhash Chandra Pujari, Tim Tarsi, Jannik Strötgen, Annemarie Friedrich |
---|---|
Frontdoor URL | https://opus.bibliothek.uni-augsburg.de/opus4/105623 |
URL: | https://biocreative.bioinformatics.udel.edu/resources/publications/bc-vii-workshop-proceedings/ |
ISBN: | 978-0-578-32368-8 |
Parent Title (English): | Proceedings of the BioCreative VII Challenge Evaluation Workshop, November 08-11, 2021 |
Publisher: | University of Delaware |
Place of publication: | Newark, DE |
Editor: | Cecilia Arighi |
Type: | Conference Proceeding |
Language: | English |
Year of first Publication: | 2021 |
Release Date: | 2023/07/10 |
First Page: | 332 |
Last Page: | 335 |
Institutes: | Fakultät für Angewandte Informatik |
 | Fakultät für Angewandte Informatik / Institut für Informatik |
 | Fakultät für Angewandte Informatik / Institut für Informatik / Professur für Sprachverstehen mit der Anwendung Digital Humanities |
Dewey Decimal Classification: | 0 Computer science, information and general works / 00 Computer science, knowledge and systems / 004 Data processing; computer science |