On the role of benchmarking data sets and simulations in method comparison studies

  • Method comparisons are essential to provide recommendations and guidance for applied researchers, who often have to choose from a plethora of available approaches. While many comparisons exist in the literature, these are often not neutral but favor a novel method. Apart from the choice of design and a proper reporting of the findings, there are different approaches concerning the underlying data for such method comparison studies. Most manuscripts on statistical methodology rely on simulation studies and provide a single real-world data set as an example to motivate and illustrate the methodology investigated. In the context of supervised learning, in contrast, methods are often evaluated using so-called benchmarking data sets, that is, real-world data that serve as gold standard in the community. Simulation studies, on the other hand, are much less common in this context. The aim of this paper is to investigate differences and similarities between these approaches, to discuss theirMethod comparisons are essential to provide recommendations and guidance for applied researchers, who often have to choose from a plethora of available approaches. While many comparisons exist in the literature, these are often not neutral but favor a novel method. Apart from the choice of design and a proper reporting of the findings, there are different approaches concerning the underlying data for such method comparison studies. Most manuscripts on statistical methodology rely on simulation studies and provide a single real-world data set as an example to motivate and illustrate the methodology investigated. In the context of supervised learning, in contrast, methods are often evaluated using so-called benchmarking data sets, that is, real-world data that serve as gold standard in the community. Simulation studies, on the other hand, are much less common in this context. The aim of this paper is to investigate differences and similarities between these approaches, to discuss their advantages and disadvantages, and ultimately to develop new approaches to the evaluation of methods picking the best of both worlds. To this aim, we borrow ideas from different contexts such as mixed methods research and Clinical Scenario Evaluation.show moreshow less

Download full text files

Export metadata

Statistics

Number of document requests

Additional Services

Share in Twitter Search Google Scholar
Metadaten
Author:Sarah FriedrichORCiDGND, Tim Friede
URN:urn:nbn:de:bvb:384-opus4-1022960
Frontdoor URLhttps://opus.bibliothek.uni-augsburg.de/opus4/102296
ISSN:0323-3847OPAC
ISSN:1521-4036OPAC
Parent Title (English):Biometrical Journal
Publisher:Wiley
Type:Article
Language:English
Year of first Publication:2024
Publishing Institution:Universität Augsburg
Release Date:2023/02/27
Tag:Statistics, Probability and Uncertainty; General Medicine; Statistics and Probability
Volume:66
Issue:1
First Page:2200212
DOI:https://doi.org/10.1002/bimj.202200212
Institutes:Mathematisch-Naturwissenschaftlich-Technische Fakultät
Mathematisch-Naturwissenschaftlich-Technische Fakultät / Institut für Mathematik
Mathematisch-Naturwissenschaftlich-Technische Fakultät / Institut für Mathematik / Lehrstuhl für Mathematical Statistics and Artificial Intelligence in Medicine
Dewey Decimal Classification:5 Naturwissenschaften und Mathematik / 51 Mathematik / 510 Mathematik
Licence (German):CC-BY 4.0: Creative Commons: Namensnennung (mit Print on Demand)