Regularization approaches in clinical biostatistics: a review of methods and their applications

A range of regularization approaches have been proposed in the data sciences to overcome overfitting, to exploit sparsity or to improve prediction. Using a broad definition of regularization, namely controlling model complexity by adding information in order to solve ill-posed problems or to prevent overfitting, we review a range of approaches within this framework including penalization, early stopping, ensembling and model averaging. Aspects of their practical implementation are discussed, including available R packages, and examples are provided. To assess the extent to which these approaches are used in medicine, we conducted a review of three general medical journals. It revealed that regularization approaches are rarely applied in practical clinical applications, with the exception of random effects models. Hence, we suggest a more frequent use of regularization approaches in medical research. In situations where other approaches also work well, the only downside of the regularization approaches is increased complexity in the conduct of the analyses, which can pose challenges in terms of computational resources and expertise on the side of the data analyst. In our view, both can and should be overcome by investments in appropriate computing facilities and educational resources.
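
As a small illustration of the abstract's definition (adding information to solve an ill-posed problem), the following sketch fits a two-predictor linear model with a ridge penalty in plain Python. It is not taken from the article: the function name `ridge_two_predictors`, the toy data and the penalty value `lam=1.0` are assumptions chosen for illustration, and the article itself discusses R packages rather than Python.

```python
def ridge_two_predictors(x1, x2, y, lam):
    """Ridge estimate for y ~ b1*x1 + b2*x2 (no intercept).

    Solves (X^T X + lam * I) b = X^T y for two predictors in closed form.
    All names and data here are illustrative assumptions.
    """
    # Entries of the penalized 2x2 matrix X^T X + lam * I
    a = sum(u * u for u in x1) + lam
    d = sum(v * v for v in x2) + lam
    b = sum(u * v for u, v in zip(x1, x2))
    # Right-hand side X^T y
    r1 = sum(u * t for u, t in zip(x1, y))
    r2 = sum(v * t for v, t in zip(x2, y))
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("normal equations are singular; no unique solution")
    # Invert the 2x2 system explicitly
    return (d * r1 - b * r2) / det, (a * r2 - b * r1) / det


# Toy data: the second predictor duplicates the first, so the
# unpenalized least-squares problem has no unique solution.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = list(x1)
y = [2.0, 4.0, 6.0, 8.0]

try:
    ridge_two_predictors(x1, x2, y, lam=0.0)
    unpenalized_solvable = True
except ValueError:
    unpenalized_solvable = False

# With a positive penalty the system becomes uniquely solvable.
b1, b2 = ridge_two_predictors(x1, x2, y, lam=1.0)
print(unpenalized_solvable, b1, b2)
```

With the penalty switched off the solver fails because the two predictors are perfectly collinear. With `lam=1.0` the effect is split equally between the two coefficients, and their sum (about 1.97) is shrunk slightly below the unpenalized slope of 2.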

Metadata
Author: Sarah Friedrich, Andreas Groll, Katja Ickstadt, Thomas Kneib, Markus Pauly, Jörg Rahnenführer, Tim Friede
URN: urn:nbn:de:bvb:384-opus4-1006519
Frontdoor URL: https://opus.bibliothek.uni-augsburg.de/opus4/100651
ISSN: 0962-2802
ISSN: 1477-0334
Parent Title (English): Statistical Methods in Medical Research
Publisher: SAGE Publications
Type: Article
Language: English
Year of first Publication: 2023
Publishing Institution: Universität Augsburg
Release Date: 2023/01/09
Tag: Health Information Management; Statistics and Probability; Epidemiology
Volume: 32
Issue: 2
First Page: 425
Last Page: 440
DOI: https://doi.org/10.1177/09622802221133557
Institutes: Mathematisch-Naturwissenschaftlich-Technische Fakultät
Mathematisch-Naturwissenschaftlich-Technische Fakultät / Institut für Mathematik
Mathematisch-Naturwissenschaftlich-Technische Fakultät / Institut für Mathematik / Lehrstuhl für Mathematical Statistics and Artificial Intelligence in Medicine
Dewey Decimal Classification: 5 Natural sciences and mathematics / 51 Mathematics / 510 Mathematics
Licence (German): CC-BY 4.0: Creative Commons: Attribution (with Print on Demand)