GERNERMED++: semantic annotation in German medical NLP through transfer-learning, translation and word alignment

  • We present GERNERMED++, a statistical model for German medical natural language processing, trained for named entity recognition (NER) and released as an open, publicly available model. We show that combining transfer learning on pre-trained deep language models (LMs), word alignment, and neural machine translation yields strong entity recognition performance, outperforming a pre-existing baseline model on several datasets. Because open, public medical entity recognition models for German texts are scarce, this work offers the German medical NLP research community a baseline model. It is a refined successor to our first GERNERMED model. As with our previous work, the trained model is publicly available to other researchers. The sample code and the statistical model are available at:

Author: Johann Frei, Ludwig Frei-Stuber, Frank Kramer
Parent Title (English): Journal of Biomedical Informatics
Publisher: Elsevier BV
Date of first Publication: 2023/10/13
Publishing Institution: Universität Augsburg
Release Date: 2023/10/17
Tag: Health Informatics; Computer Science Applications
First Page: 104513
Institutes: Fakultät für Angewandte Informatik
Fakultät für Angewandte Informatik / Institut für Informatik
Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für IT-Infrastrukturen für die Translationale Medizinische Forschung
Dewey Decimal Classification: 0 Computer science, information science, general works / 00 Computer science, knowledge, systems / 004 Data processing; computer science
Licence: CC-BY-NC-ND 4.0: Creative Commons: Attribution - NonCommercial - NoDerivatives (with print on demand)