Dear colleagues,
Since our prior Perspectives piece on artificial intelligence (AI) in GI and Hepatology in 2022, the field has seen almost exponential growth. Expectations are high that AI will revolutionize our field and significantly improve patient care. But as the global discussion on AI has shown, there are real challenges with adoption, including issues with accuracy, reliability, and privacy.
In this issue, Dr. Nabil M. Mansour and Dr. Thomas R. McCarty explore the current and future impact of AI on gastroenterology, while Dr. Basile Njei and Yazan A. Al Ajlouni assess its role in hepatology. We hope these pieces will inform your discussions about incorporating or researching AI in your own practices. We welcome your thoughts on this issue on X @AGA_GIHN.
Gyanprakash A. Ketwaroo, MD, MSc, is associate professor of medicine, Yale University, New Haven, Conn., and chief of endoscopy at West Haven (Conn.) VA Medical Center. He is an associate editor for GI & Hepatology News.
Artificial Intelligence in Gastrointestinal Endoscopy
BY THOMAS R. MCCARTY, MD, MPH; NABIL M. MANSOUR, MD
The last few decades have seen an exponential increase in interest in artificial intelligence (AI) and in the adoption of deep learning algorithms within healthcare and patient care services. The field of gastroenterology and endoscopy has similarly seen a tremendous uptake in the acceptance and implementation of AI for a variety of gastrointestinal conditions. The spectrum of AI-based applications includes detection and diagnostic tools as well as therapeutic assistance tools. From the first US Food and Drug Administration (FDA)-approved device that uses machine learning to assist clinicians in detecting lesions during colonoscopy, to more innovative machine learning techniques for small bowel, esophageal, and hepatobiliary conditions, AI has dramatically changed the landscape of gastrointestinal endoscopy.
Approved applications for colorectal cancer
In an attempt to improve colorectal cancer screening and surveillance outcomes, efforts have focused on procedural performance metrics, quality indicators, and tools to aid in lesion detection and improve quality of care. One such tool has been computer-aided detection (CADe), with early randomized controlled trial (RCT) data showing significantly increased adenoma detection rate (ADR) and adenomas per colonoscopy (APC).1-3
Ultimately, this data led to FDA approval of the CADe system GI Genius (Medtronic, Dublin, Ireland) in 2021.4 Additional systems have since been FDA approved or 510(k) cleared including Endoscreener (Wision AI, Shanghai, China), SKOUT (Iterative Health, Cambridge, Massachusetts), MAGENTIQ-COLO (MAGENTIQ-EYE LTD, Haifa, Israel), and CAD EYE (Fujifilm, Tokyo), all of which have shown increased ADR and/or increased APC and/or reduced adenoma miss rates in randomized trials.5
Yet despite the promise of improved quality and subsequent translation to better patient outcomes, there has been a noticeable disconnect between RCT data and more real-world literature.6 In a recent study, implementation of a CADe system for colorectal cancer screening yielded no improvement in ADR among either higher- or lower-ADR performers. Looking at change over time after implementation, CADe had no positive effect in any group, divergent from early RCT data. In a more recent multicenter, community-based RCT, CADe again did not result in a statistically significant difference in the number of adenomas detected.7 The differences between some of these more recent "real-world" studies and the majority of RCT data raise important questions regarding the potential for bias (due to unblinding) in prospective trials, as well as the role of the human-AI interaction.
Importantly for the RCT data, both cohorts in these studies met adequate ADR benchmarks, though it remains unclear whether a truly increased ADR translates to better patient outcomes; is higher always better? In addition, an important consideration in evaluating any AI/CADe system is that these systems often undergo frequent updates, each promising improved accuracy, sensitivity, and specificity. This is an interesting dilemma and raises questions about the enduring relevance of studies conducted using an outdated version of a CADe system.
Additional questions remain unanswered regarding the ideal ADR for implementation, the preferred patient populations for screening (especially younger individuals), and the role and adoption of computer-aided polyp diagnosis/characterization (CADx) within the United States. Furthermore, questions regarding procedural withdrawal time, impact on sessile serrated lesion detection, cost-effectiveness, and preferred adoption strategies have begun to be explored, though they require more data to better define a best practice approach. Ultimately, answers to some of these unknowns may explain the discordant results and help guide future implementation measures.