Is ChatGPT Reliable for CRC Screening/Surveillance Advice?

TOPLINE:

ChatGPT (version 3.5) provides relatively poor and inconsistent responses when asked about appropriate colorectal cancer (CRC) screening and surveillance, a new study showed.

METHODOLOGY:

  • Three board-certified gastroenterologists with 10+ years of clinical experience developed five CRC screening and five CRC surveillance clinical vignettes (with multiple choice answers), which were fed to ChatGPT version 3.5.
  • ChatGPT’s responses were recorded over four separate sessions and screened for accuracy to determine reliability of the tool.
  • ChatGPT’s average number of correct answers was compared with that of 238 gastroenterologists and colorectal surgeons who answered the same questions with and without the help of a previously validated CRC screening mobile app.

TAKEAWAY:

  • ChatGPT’s average overall performance was 45%; the average number of correct answers was 2.75 of 5 for screening and 1.75 of 5 for surveillance.
  • ChatGPT’s responses were inconsistent: across the four sessions, the tool gave differing answers to four of the questions.
  • ChatGPT’s average total number of correct answers was significantly lower (P < .001) than that of physicians with and without the mobile app (7.71 and 5.62 correct answers, respectively).
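
The 45% figure appears consistent with pooling the two category averages over all 10 vignettes. The short Python sketch below works through that arithmetic using only the numbers reported above. The variable and function names are illustrative, and the percentage conversion for physicians is an assumption based on their answering the same 10 questions; the article reports only their mean number of correct answers.

```python
# Worked arithmetic using figures reported in the article.
# All names here are illustrative, not taken from the study.

VIGNETTES_PER_CATEGORY = 5   # five screening and five surveillance vignettes
TOTAL_VIGNETTES = 2 * VIGNETTES_PER_CATEGORY

chatgpt_screening = 2.75     # mean correct answers, screening
chatgpt_surveillance = 1.75  # mean correct answers, surveillance
chatgpt_total = chatgpt_screening + chatgpt_surveillance


def percent_correct(mean_correct, n_questions):
    """Convert a mean number of correct answers into a percentage."""
    return 100 * mean_correct / n_questions


chatgpt_pct = percent_correct(chatgpt_total, TOTAL_VIGNETTES)

# Assumption: physician means are out of the same 10 questions.
with_app_pct = percent_correct(7.71, TOTAL_VIGNETTES)
without_app_pct = percent_correct(5.62, TOTAL_VIGNETTES)

print(f"ChatGPT 3.5:            {chatgpt_pct:.1f}%")
print(f"Physicians with app:    {with_app_pct:.1f}%")
print(f"Physicians without app: {without_app_pct:.1f}%")
```

On this reading, ChatGPT's 4.5 correct answers out of 10 sit below both physician groups, matching the direction of the reported comparison.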

IN PRACTICE:

“The use of validated mobile apps with decision-making algorithms could serve as more reliable assistants until large language models developed with AI are further refined,” the authors concluded.

SOURCE:

The study, with first author Lisandro Pereyra, MD, Department of Gastroenterology, Hospital Alemán of Buenos Aires, Argentina, was published online on February 7, 2024, in the Journal of Clinical Gastroenterology.

LIMITATIONS:

The 10 clinical vignettes represented a relatively small sample for assessing accuracy. The study did not use the latest version of ChatGPT. No attempts at “fine-tuning” with diverse prompts, instructions, or relevant data were made, although such efforts could potentially improve the chatbot’s performance.

DISCLOSURES:

The study had no specific funding. The authors declared no conflicts of interest.

A version of this article appeared on Medscape.com.
