
Smartphone applications (apps) that use so-called artificial intelligence (AI) and are aimed at the general public for assessing suspicious skin lesions are unreliable, U.K. researchers reported in a systematic review.
 

These apps are providing information that could lead to “potentially life-or-death decisions,” commented co-lead author Hywel C. Williams, MD, from the Centre of Evidence Based Dermatology, University of Nottingham (England).

“The one thing you mustn’t do in a situation where early diagnosis can make a difference between life and death is you mustn’t miss the melanoma,” he said in an interview.

“These apps were missing melanomas, and that’s very worrisome,” he commented.

The review included nine studies of skin cancer smartphone apps, including two apps, SkinScan and SkinVision, that have been given Conformité Européenne (CE) marks, allowing them to be marketed across Europe. These apps are also available in Australia and New Zealand, but not in the United States.

The review found that SkinScan was not able to identify any melanomas in the one study that assessed this app, while SkinVision had a relatively low sensitivity and specificity, with 12% of cancerous or precancerous lesions missed and 21% of benign lesions wrongly identified as cancerous.

This means that among 1,000 people with a melanoma prevalence of 3%, 4 of 30 melanomas would be missed, and 200 people would be incorrectly told that a mole was of high concern, the authors estimated.
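To make the arithmetic behind those estimates explicit, here is a minimal back-of-envelope sketch in Python that uses only the figures quoted above (1,000 people, a 3% melanoma prevalence, 12% of melanomas missed, and 21% of benign lesions wrongly flagged); the variable names are illustrative and do not come from the review itself.

```python
# Back-of-envelope calculation using the figures quoted in the article.
population = 1_000
prevalence = 0.03         # 3% of the 1,000 people have melanoma
miss_rate = 0.12          # share of melanomas the app fails to flag
false_alarm_rate = 0.21   # share of benign lesions the app wrongly flags

melanomas = population * prevalence              # 30 true melanomas
missed = round(melanomas * miss_rate)            # about 4 melanomas missed
benign = population - melanomas                  # 970 benign lesions
false_alarms = round(benign * false_alarm_rate)  # about 204, i.e. roughly 200

print(f"Melanomas missed: {missed} of {int(melanomas)}")
print(f"People wrongly told a mole is of high concern: {false_alarms}")
```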

The research was published by The BMJ on Feb. 10.

“Although I was broad minded on the potential benefit of apps for diagnosing skin cancer, I am now worried given the results of our study and the overall poor quality of studies used to test these apps,” Dr. Williams commented in a statement.

Coauthor Jac Dinnes, PhD, from the Institute of Applied Health Research at the University of Birmingham (England), added that it is “really disappointing that there is not better quality evidence available to judge the efficacy of these apps.”

“It is vital that health care professionals are aware of the current limitations both in the technologies and in their evaluations,” she added.

The results also highlight the limitations of the regulatory system governing smartphone apps, which are currently not subject to assessment by bodies such as the U.K.’s Medicines and Healthcare Products Regulatory Agency (MHRA), the authors commented.

“Regulators need to become alert to the potential harm that poorly performing algorithm-based diagnostic or risk monitoring apps create,” said co-lead author Jonathan J. Deeks, PhD, also at the Institute of Applied Health Research.

“We rely on the CE mark as a sign of quality, but the current CE mark assessment processes are not fit for protecting the public against the risks that these apps present.”

Speaking in an interview, Dr. Williams lamented the poor quality of the research conducted so far. “These studies were not good enough,” he said, adding that “there’s no excuse for really poor study design and poor reporting.”

He would like to see the regulations tightened around AI apps purporting to inform decision making for the general public and suggests that these devices should be assessed by the MHRA. “I really do think a CE mark is not enough,” he said.

The team noted that the skin cancer apps “all include disclaimers that the results should only be used as a guide and cannot replace health care advice,” through which the manufacturers “attempt to evade any responsibility for negative outcomes experienced by users.”

Nevertheless, the “poor and variable performance” of the apps revealed by their review indicates that they “have not yet shown sufficient promise to recommend their use,” they concluded.

The “official approval” implied by a CE mark “will give consumers the impression that the apps have been assessed as effective and safe,” wrote Ben Goldacre, DataLab director, Nuffield Department of Primary Care, University of Oxford (England), and colleagues in an accompanying editorial.

“The implicit assumption is that apps are similarly low-risk technology” to devices such as sticking plasters and reading glasses, they comment.

“But shortcomings in diagnostic apps can have serious implications,” they warn. The “risks include psychological harm from health anxiety or ‘cyberchondria,’ and physical harm from misdiagnosis or overdiagnosis; for clinicians there is a risk of increased workload, and changes to ethical or legal responsibilities around triage, referral, diagnosis, and treatment.” There is also potential for “inappropriate resource use, and even loss of credibility for digital technology in general.”

 

 

Details of the review

For their review, the authors searched the Cochrane Central Register of Controlled Trials, MEDLINE, Embase, the Cumulative Index to Nursing and Allied Health Literature, the Conference Proceedings Citation Index, Zetoc, and the Science Citation Index databases, as well as online trial registers, for studies published between August 2016 and April 2019.

Of the 80 studies identified, nine met the eligibility criteria.

Of those, six studies, evaluating a total of 725 skin lesions, determined the accuracy of smartphone apps in risk stratifying suspicious skin lesions by comparing them against a histopathological reference standard diagnosis or expert follow-up.

Five of these studies aimed to detect only melanoma, while one sought to differentiate between malignant or premalignant lesions (including melanoma, basal cell carcinoma, and squamous cell carcinoma) and benign lesions.

The three remaining studies, which evaluated 407 lesions in all, compared smartphone app recommendations against a reference standard of expert recommendations for further investigation or intervention.

The researchers found that the studies had a number of potential biases and limitations.

For example, only four studies recruited a consecutive sample of study participants and lesions, and only two included lesions selected by study participants, whereas five studies used lesions that had been selected by a clinician.

Three studies reported that it took 5-10 attempts to obtain an adequate image. In seven studies, it was the researchers and not the patients who used the app to photograph the lesions, and two studies used images obtained from dermatology databases.

This “raised concerns that the results of the studies were unlikely to be representative of real life use,” the authors comment.

In addition, the exclusion of unevaluable images “might have systematically inflated the diagnostic performance of the tested apps,” they add.

The independent research was supported by the National Institute for Health Research (NIHR) Birmingham Biomedical Research Centre at the University Hospitals Birmingham NHS Foundation Trust and the University of Birmingham and is an update of one of a collection of reviews funded by the NIHR through its Cochrane Systematic Review Programme Grant.
 

This article first appeared on Medscape.com.
