, according to investigators.
These findings suggest that SCORE-AI, the model tested, can reliably interpret common EEGs in real-world practice, supporting its recent FDA approval, reported lead author Daniel Mansilla, MD, a neurologist at Montreal Neurological Institute and Hospital, and colleagues.
“Overinterpretation of clinical EEG is the most common cause of misdiagnosing epilepsy,” the investigators wrote in Epilepsia. “AI tools may be a solution for this challenge, both as an additional resource for confirmation and classification of epilepsy, and as an aid for the interpretation of EEG in critical care medicine.”
To date, however, AI tools have struggled with the variability encountered in real-world neurology practice. “When tested on external data from different centers and diverse patient populations, and using equipment distinct from the initial study, medical AI models frequently exhibit modest performance, and only a few AI tools have successfully transitioned into medical practice,” the investigators wrote.
SCORE-AI Matches Expert Interpretation of Routine EEGs
The present study put SCORE-AI to the test with EEGs from 104 patients aged 16 to 91 years. These individuals hailed from “geographically distinct” regions, while recording equipment and conditions also varied widely, according to Dr. Mansilla and colleagues.
To set an external gold-standard for comparison, EEGs were first interpreted by three human expert raters, who were blinded to all case information except the EEGs themselves. The dataset comprised 50% normal and 50% abnormal EEGs. Four major classes of EEG abnormalities were included: focal epileptiform, generalized epileptiform, focal nonepileptiform, and diffuse nonepileptiform.
Comparing SCORE-AI interpretations with the experts’ interpretations revealed no significant difference in any metric or category. The AI tool had an overall accuracy of 92%, compared with 94% for the human experts. Of note, SCORE-AI maintained this level of performance regardless of vigilance state or normal variants.
“SCORE-AI has obtained FDA approval for routine clinical EEGs and is presently being integrated into broadly available EEG software (Natus NeuroWorks),” the investigators wrote.
Further Validation May Be Needed
Wesley T. Kerr, MD, PhD, functional (nonepileptic) seizures clinic lead epileptologist at the University of Pittsburgh Medical Center, and handling associate editor for this study in Epilepsia, said the present findings are important because they show that SCORE-AI can perform in scenarios beyond the one in which it was developed.
Still, it may be premature for broad commercial rollout.
In a written comment, Dr. Kerr called for “much larger studies” to validate SCORE-AI, noting that seizures can be caused by “many rare conditions,” and some patients have multiple EEG abnormalities.
Since SCORE-AI has not yet demonstrated accuracy in those situations, he predicted that the tool will remain exactly that – a tool – rather than a replacement for human experts.
“They have only looked at SCORE-AI by itself,” Dr. Kerr said. “Practically, SCORE-AI is going to be used in combination with a neurologist for a long time before SCORE-AI can operate semi-independently or independently. They need to do studies looking at this combination to see how this tool impacts the clinical practice of EEG interpretation.”
Daniel Friedman, MD, an epileptologist and associate clinical professor of neurology at NYU Langone, pointed out another limitation of the present study: The EEGs were collected at specialty centers.
“The technical standards of data collection were, therefore, pretty high,” Dr. Friedman said in a written comment. “The majority of EEGs performed in the world are not collected by highly skilled EEG technologists and the performance of AI classification algorithms under less-than-ideal technical conditions is unknown.”