Can ChatGPT replace diabetes educators? Perhaps not yet

Publish date: April 4, 2023

ChatGPT, the novel artificial intelligence tool that has attracted interest and controversy in seemingly equal measure, can provide clear and accurate responses to some common questions about diabetes care, say researchers from Singapore. But they also have some reservations.

Chatbots such as ChatGPT use natural-language AI to draw on large repositories of human-generated text from the internet and provide human-like responses that are statistically likely to match the query.

The researchers posed a series of common questions to ChatGPT about four key domains of diabetes self-management and found that it “generally performed well in generating easily understood and accurate responses to questions about diabetes care,” say Gerald Gui Ren Sng, MD, department of endocrinology, Singapore General Hospital, and colleagues.

Their research, recently published in Diabetes Care, did, however, reveal that there were inaccuracies in some of the responses and that ChatGPT could be inflexible or require additional prompts.

ChatGPT not trained on medical databases

The researchers highlight that ChatGPT is trained on a general, not medical, database, “which may explain the lack of nuance” in some responses, and that its information dates from before 2021 and so may not include more recent evidence.

There are also “potential factual inaccuracies” in its answers that “pose a strong safety concern,” the team says, noting that the tool is prone to so-called “hallucination,” whereby inaccurate information is presented in a persuasive manner.

Dr. Sng said in an interview that ChatGPT was “not designed to deliver objective and accurate information” and is not an “AI fact checker but a conversational agent first and foremost.”

“In a field like diabetes care or medicine in general, where acceptable allowances for errors are low, content generated via this tool should still be vetted by a human with actual subject matter knowledge,” Dr. Sng emphasized.

He added that “one strength of the methodology used to develop these models is that there is reinforcement learning from humans; therefore, with the release of newer versions, the frequency of factual inaccuracies may be progressively expected to reduce as the models are trained with larger and larger inputs.”

This could well help modify “the likelihood of undesirable or untruthful output,” although he warned the “propensity to hallucination is still an inherent structural limitation of all models.”

Advise patients

“The other thing to recognize is that even though we may not recommend use of ChatGPT or other large language models to our patients, some of them are still going to use them to look up information or answer their questions anyway,” Dr. Sng observed.

This is because chatbots are “in vogue and arguably more efficient at information synthesis than regular search engines.”

He underlined that the purpose of the new research was to help increase awareness of the strengths and limitations of such tools among clinicians and diabetes educators “so that we are better equipped to advise our patients who may have obtained information from such a source.”

“In the same way ... [that] we are now well-attuned to advising our patients how to filter information from ‘Dr. Google,’ perhaps a better understanding of ‘Dr. ChatGPT’ will also be useful moving forward,” Dr. Sng added.

Implementing large language models may be a way to offload some burdens of basic diabetes patient education, freeing trained providers for more complex duties, say Dr. Sng and colleagues.
