
Artificial Intelligence Helps Diagnose Lung Disease in Infants and Outperforms Trainee Doctors

FROM ERS 2024

Artificial intelligence (AI) can assist doctors in assessing and diagnosing respiratory illnesses in infants and children, according to two new studies presented at the European Respiratory Society (ERS) 2024 Congress.

Researchers can train artificial neural networks (ANNs) to detect lung disease in premature babies by analyzing their breathing patterns while they sleep. “Our noninvasive test is less distressing for the baby and their parents, meaning they can access treatment more quickly, and may also be relevant for their long-term prognosis,” said Edgar Delgado-Eckert, PhD, adjunct professor in the Department of Biomedical Engineering at The University of Basel, Switzerland, and a research group leader at the University Children’s Hospital, Switzerland.

Manjith Narayanan, MD, a consultant in pediatric pulmonology at the Royal Hospital for Children and Young People, Edinburgh, and honorary senior clinical lecturer at The University of Edinburgh, United Kingdom, said chatbots such as ChatGPT, Bard, and Bing can perform as well as or better than trainee doctors when assessing children with respiratory issues. He said chatbots could triage patients more quickly and ease pressure on health services.

Chatbots Show Promise in Triage of Pediatric Respiratory Illnesses

Researchers at The University of Edinburgh gave clinical scenarios to 10 trainee doctors who had less than 4 months of clinical experience in pediatrics. The scenarios covered topics such as cystic fibrosis, asthma, sleep-disordered breathing, breathlessness, and chest infections, as well as cases with no obvious diagnosis.

The trainee doctors had 1 hour to solve each scenario with a descriptive answer; they were allowed to use the internet but not chatbots.

Each scenario was also presented to three large language models (LLMs): OpenAI’s ChatGPT, Google’s Bard, and Microsoft’s Bing.

Six pediatric respiratory experts assessed all responses, scoring correctness, comprehensiveness, usefulness, plausibility, and coherence on a scale of 0-9. They were also asked to say whether they thought a human or a chatbot generated each response.

ChatGPT scored an average of 7 out of 9 overall, and its responses were judged to be more human-like than those of the other chatbots. Bard scored an average of 6 out of 9 and was rated more “coherent” than the trainee doctors but was otherwise neither better nor worse. Bing and the trainee doctors each scored an average of 4 out of 9. The six pediatricians reliably identified Bing’s and Bard’s responses as nonhuman.

“Our study is the first, to our knowledge, to test LLMs against trainee doctors in situations that reflect real-life clinical practice,” Narayanan said. “We did this by allowing the trainee doctors to have full access to resources available on the internet, as they would in real life. This moves the focus away from testing memory, where LLMs have a clear advantage.”

Narayanan said that these models could help nurses, trainee doctors, and primary care physicians triage patients quickly and assist medical professionals in their studies by summarizing their thought processes. “The key word, though, is ‘assist.’ They cannot replace conventional medical training yet,” he told Medscape Medical News.

The researchers found no obvious hallucinations — seemingly made-up information — with any of the three LLMs. Still, Narayanan said, “We need to be aware of this possibility and build mitigations.”

Hilary Pinnock, ERS education council chair and professor of primary care respiratory medicine at The University of Edinburgh, who was not involved in the research, said it is both exciting and worrying to see how widely available AI tools can provide solutions to complex cases of respiratory illness in children. “It certainly points the way to a brave new world of AI-supported care.”

“However, before we start to use AI in routine clinical practice, we need to be confident that it will not create errors either through ‘hallucinating’ fake information or because it has been trained on data that does not equitably represent the population we serve,” she said.
