Is ChatGPT Reliable for CRC Screening/Surveillance Advice?


 

TOPLINE:

ChatGPT (version 3.5) gave relatively poor and inconsistent answers when asked about appropriate colorectal cancer (CRC) screening and surveillance, a new study showed.

METHODOLOGY:

  • Three board-certified gastroenterologists with 10+ years of clinical experience developed five CRC screening and five CRC surveillance clinical vignettes (with multiple-choice answers), which were submitted to ChatGPT version 3.5.
  • ChatGPT’s responses were recorded over four separate sessions and checked for accuracy to assess the reliability of the tool.
  • ChatGPT’s average number of correct answers was compared with that of 238 gastroenterologists and colorectal surgeons who answered the same questions with and without the help of a previously validated CRC screening mobile app.

TAKEAWAY:

  • ChatGPT’s average overall performance was 45%; the average number of correct answers was 2.75 of 5 for screening and 1.75 of 5 for surveillance.
  • ChatGPT’s responses were inconsistent: For four of the questions, the tool gave different answers across sessions.
  • ChatGPT’s average total number of correct answers was significantly lower (P < .001) than that of physicians with and without the mobile app (7.71 and 5.62 correct answers, respectively).
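
The 45% figure follows directly from the two per-category averages, given the five screening and five surveillance vignettes described in the methodology. The short sketch below reproduces that arithmetic and puts the physician averages on the same scale. The function and variable names are hypothetical and used only for illustration; the numbers are those reported above, and nothing here reproduces the study's statistical test.

```python
# Illustrative arithmetic only. Names are hypothetical; values come from the summary above.
TOTAL_VIGNETTES = 10  # five screening + five surveillance vignettes


def overall_accuracy(correct_answers: float, total: int = TOTAL_VIGNETTES) -> float:
    """Return the average fraction of vignettes answered correctly."""
    return correct_answers / total


# ChatGPT: average correct answers per category, summed across both categories.
chatgpt_total = 2.75 + 1.75

# Average total correct answers reported for each group.
reported_totals = {
    "ChatGPT 3.5": chatgpt_total,
    "Physicians with app": 7.71,
    "Physicians without app": 5.62,
}

for group, correct in reported_totals.items():
    print(f"{group}: {correct:.2f} of {TOTAL_VIGNETTES} ({overall_accuracy(correct):.0%})")
```

On this scale, ChatGPT's average falls below that of physicians answering without the app, which in turn falls below that of physicians using the app.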

IN PRACTICE:

“The use of validated mobile apps with decision-making algorithms could serve as more reliable assistants until large language models developed with AI are further refined,” the authors concluded.

SOURCE:

The study, with first author Lisandro Pereyra, MD, Department of Gastroenterology, Hospital Alemán of Buenos Aires, Argentina, was published online on February 7, 2024, in the Journal of Clinical Gastroenterology.

LIMITATIONS:

The 10 clinical vignettes were a relatively small sample for assessing accuracy. The study did not use the latest version of ChatGPT. The authors made no “fine-tuning” attempts using diverse prompts, instructions, or relevant data, which could potentially have improved the chatbot’s performance.

DISCLOSURES:

The study had no specific funding. The authors declared no conflicts of interest.

A version of this article appeared on Medscape.com.
