David Roser, Michael Meinikheim, Anna Muzalyova, Robert Mendel, Christoph Palm, Andreas Probst, Sandra Nagl, Markus W. Scheppach, Christoph Römmele, Elisabeth Schnoy, Nasim Parsa, Michael F. Byrne, Helmut Messmann, Alanna Ebigbo
Objective: Despite high stand-alone performance, studies demonstrate that artificial intelligence (AI)-supported endoscopic diagnostics often fall short in clinical applications due to human-AI interaction factors. This video-based trial on Barrett's esophagus aimed to investigate how examiner behavior, levels of confidence, and system usability influence the diagnostic outcomes of AI-assisted endoscopy.
Methods: The present analysis employed data from a multicenter randomized controlled tandem video trial involving 22 endoscopists with varying degrees of expertise. Participants were tasked with evaluating a set of 96 endoscopic videos of Barrett's esophagus in two distinct rounds, with and without AI assistance. Diagnostic confidence levels were recorded, and decision changes were categorized according to the AI prediction. Additional surveys assessed user experience and system usability ratings.
Results: AI assistance significantly increased examiner confidence levels (p < 0.001) and accuracy. Withdrawing AI assistance decreased confidence (p < 0.001), but not accuracy. Experts consistently reported higher confidence than non-experts (p < 0.001), regardless of performance. Despite improved confidence, correct AI guidance was disregarded in 16% of all cases, and 9% of initially correct diagnoses were changed to incorrect ones. Overreliance on AI, algorithm aversion, and uncertainty in AI predictions were identified as key factors influencing outcomes. The System Usability Scale questionnaire scores indicated good to excellent usability, with non-experts scoring 73.5 and experts 85.6.
Conclusions: Our findings highlight the pivotal role of examiner behavior in AI-assisted endoscopy. To fully realize the benefits of AI, implementing explainable AI, improving user interfaces, and providing targeted training are essential. Addressing these factors could enhance diagnostic accuracy and confidence in clinical practice.

