Implementing Trustworthy AI in VA High Reliability Health Care Organizations

Article Type
Changed
Thu, 02/01/2024 - 11:46

Adoption of artificial intelligence (AI) has lagged in health care, yet AI has considerable potential to improve quality, safety, clinician experience, and access to care. It is being tested in areas such as billing, hospital operations, and prevention of adverse events (eg, sepsis mortality), with some early success. However, many barriers still prevent the widespread use of AI, including data problems, misaligned incentives, and workplace obstacles. Innovative projects, partnerships, better-aligned incentives, and more investment could remove these barriers. Implemented reliably and safely, AI can add to what clinicians know, help them work faster, cut costs, and, most importantly, improve patient care.1

AI can potentially bring several clinical benefits, such as reducing the administrative strain on clinicians and granting them more time for direct patient care. It can also improve diagnostic accuracy by analyzing patient data and diagnostic images and by providing differential diagnoses, and it can increase access to care by providing medical information and essential online services to patients.2

High Reliability Organizations

table_1.png

High reliability health care organizations have considerable experience safely launching new programs. For example, the Patient Safety Adoption Framework gives practical tips for smoothly rolling out safety initiatives (Table 1). Developed with input from experts and diverse perspectives, this framework has 5 key areas: leadership, culture and context, process, measurement, and person-centeredness. These areas address adoption problems, guide leaders step-by-step, and focus on leadership buy-in, safety culture, cooperation, and local customization. Checklists and tools provide a systematic path from ideas to action on patient safety.3

Leadership involves establishing organizational commitment behind new safety programs. This visible commitment signals importance and priorities to others. Leaders model desired behaviors and language around safety, allocate resources, remove obstacles, and keep initiatives energized over time through consistent messaging.4 Culture and context recognizes that safety culture differs across units and facilities. Local input tailors programs to fit and examines strengths to build on, like psychological safety. Surveys gauge the existing culture and its need for change. Process details how to plan, design, test, implement, and improve new safety practices and provides a phased roadmap from idea to results. Measurement collects data to drive improvement and show impact. Metrics track progress and allow benchmarking. Person-centeredness puts patients first in safety efforts through participation, education, and transparency.

The Veterans Health Administration piloted a comprehensive high reliability hospital (HRH) model. Over 3 years, it focused on leadership, culture, and process improvement at the pilot hospital. After initiating the model, the pilot hospital improved its safety culture, reported more minor safety issues, and reduced deaths and complications more than other hospitals did. The high-reliability approach successfully instilled high-reliability principles and improved culture and outcomes. The HRH model is set to be expanded to 18 more US Department of Veterans Affairs (VA) sites for further evaluation across diverse settings.5

 

 

Trustworthy AI Framework

table_2.png

AI systems are growing more powerful and widespread, including in health care. Unfortunately, irresponsible AI can introduce new harm. ChatGPT and other large language models, for example, are known to sometimes present erroneous information in a convincing way. Clinicians and patients who use such programs may act on that information, which could lead to unforeseen negative consequences. Several frameworks on ethical AI have come from governmental groups.6-9 In 2023, the VA National AI Institute proposed a Trustworthy AI Framework based on core principles tailored for federal health care. The framework has 6 key principles: purposeful, effective and safe, secure and private, fair and equitable, transparent and explainable, and accountable and monitored (Table 2).10

First, AI must clearly help veterans while minimizing risks. To ensure that AI is purposeful, the VA will assess patient and clinician needs and design AI that targets meaningful problems, avoiding scope creep and feature bloat. For example, adding new features to AI software after release can clutter and complicate the interface, making it difficult to use. Rigorous testing will confirm that AI meets its intended purpose before deployment. Second, AI is designed and checked for effectiveness, safety, and reliability. The VA pledges to monitor AI’s impact to ensure it performs as expected without unintended consequences. Algorithms will be stress tested across representative datasets, and approval processes will screen for safety issues. Third, AI models are secured against vulnerabilities and misuse. Technical controls will prevent unauthorized access or changes to AI systems. Audits will check that internal usage complies with policy. Continual patches and upgrades will maintain security. Fourth, the VA manages AI for fairness and works to avoid bias. It will proactively assess datasets and algorithms for potential biases based on protected attributes such as race, gender, or age. Biased outputs will be addressed through techniques such as data augmentation, reweighting, and algorithm adjustments. Fifth, transparency means explaining AI’s role in care. Documentation will detail an AI system’s data sources, methodology, testing, limitations, and integration with clinical workflows. Clinicians and patients will receive education on interpreting AI outputs. Finally, the VA pledges to closely monitor AI systems to sustain trust. The VA will establish oversight processes to quickly identify any declines in reliability or unfair impacts on subgroups. AI models will be retrained as needed based on incoming data patterns.
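As one illustration of the reweighting technique mentioned above, the following minimal Python sketch assigns each training record a weight inversely proportional to the size of its demographic group, so that every group carries the same total weight during model training. The function name, the age-band labels, and the 6-record sample are hypothetical and are not drawn from the VA framework; this is a sketch of one common approach, not a description of how the VA mitigates bias.

```python
from collections import Counter


def inverse_frequency_weights(groups):
    """Return one weight per record so each group has equal total weight.

    Hypothetical helper for illustration only.
    """
    counts = Counter(groups)   # number of records in each group
    n_groups = len(counts)
    total = len(groups)
    # weight = total / (n_groups * group_size), so the weights within
    # each group sum to total / n_groups
    return [total / (n_groups * counts[g]) for g in groups]


# Hypothetical age-band labels for six training records
groups = ["under_65", "under_65", "under_65", "under_65", "65_plus", "65_plus"]
weights = inverse_frequency_weights(groups)
print(weights)
```

In this sample the smaller group receives the larger per-record weight, and the weights still sum to the number of records, so the overall scale of the training objective is unchanged.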

Each Trustworthy AI Framework principle connects to principles in existing frameworks. The purpose principle aligns with human-centric AI focused on benefits. Effectiveness and safety link to technical robustness and risk management principles. Security maps to privacy protection principles. Fairness connects to principles of avoiding bias and discrimination. Transparency corresponds with accountable and explainable AI. Monitoring and accountability tie back to governance principles. Overall, the VA framework aims to guide ethical AI based on context. It offers a model for managing risks and building trust in health care AI.

Combining VA principles with high-reliability safety principles can help ensure that AI benefits veterans. The leadership and culture aspects will drive commitment to trustworthy AI practices. Leaders will communicate the importance of responsible AI through words and actions. Culture surveys can assess baseline awareness of AI ethics issues to target education. AI security and fairness will be emphasized as safety-critical. The process aspect will institute policies and procedures to uphold AI principles throughout the project lifecycle. For example, structured testing processes will validate safety. Measurement will collect data on principles like transparency and fairness. Dashboards can track metrics like explainability and biases. A patient-centered approach will incorporate veteran perspectives on AI through participatory design and advisory councils. Veterans can give input on AI explainability and potential biases based on their diverse backgrounds.
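To make the measurement idea concrete, a dashboard bias metric could be as simple as comparing model accuracy across subgroups and flagging a large gap for review. The Python sketch below is a minimal example under stated assumptions: the function names, the subgroup labels "A" and "B", the sample predictions, and the 0.1 gap threshold are all hypothetical, and the source does not specify which metrics or thresholds the VA uses.

```python
from collections import defaultdict


def subgroup_accuracy(y_true, y_pred, groups):
    """Compute accuracy separately for each subgroup label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


def flag_gap(acc_by_group, max_gap=0.1):
    """Return (flagged, gap); flagged is True if the best and worst
    subgroup accuracies differ by more than max_gap (an assumed threshold)."""
    gap = max(acc_by_group.values()) - min(acc_by_group.values())
    return gap > max_gap, gap


# Hypothetical labels, predictions, and subgroup membership
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

acc = subgroup_accuracy(y_true, y_pred, groups)
flagged, gap = flag_gap(acc)
print(acc, flagged, gap)
```

Accuracy is used here only because it is simple to read; a real dashboard would likely track several metrics (eg, sensitivity and false-positive rate per subgroup) chosen with clinical input.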

Conclusions

Applying the two sets of principles together can lead to successful AI that improves care while proactively managing risks. Organizations should involve leaders to stress the necessity of eliminating biases, build security into the AI development process, co-design AI transparency features with end users, and closely monitor the impact of AI across safety, fairness, and other principles. Adhering to both Trustworthy AI and high reliability organization principles can help earn veterans’ confidence. Health care organizations like the VA can integrate ethical AI safely via established frameworks. With responsible design and implementation, AI’s potential to enhance care quality, safety, and access can be realized.

Acknowledgments

We would like to acknowledge Joshua Mueller, Theo Tiffney, John Zachary, and Gil Alterovitz for their excellent work creating the VA Trustworthy Principles. This material is the result of work supported by resources and the use of facilities at the James A. Haley Veterans’ Hospital.

References

1. Sahni NR, Carrus B. Artificial intelligence in U.S. health care delivery. N Engl J Med. 2023;389(4):348-358. doi:10.1056/NEJMra2204673

2. Borkowski AA, Jakey CE, Mastorides SM, et al. Applications of ChatGPT and large language models in medicine and health care: benefits and pitfalls. Fed Pract. 2023;40(6):170-173. doi:10.12788/fp.0386

3. Moyal-Smith R, Margo J, Maloney FL, et al. The patient safety adoption framework: a practical framework to bridge the know-do gap. J Patient Saf. 2023;19(4):243-248. doi:10.1097/PTS.0000000000001118

4. Isaacks DB, Anderson TM, Moore SC, Patterson W, Govindan S. High reliability organization principles improve VA workplace burnout: the Truman THRIVE2 model. Am J Med Qual. 2021;36(6):422-428. doi:10.1097/01.JMQ.0000735516.35323.97

5. Sculli GL, Pendley-Louis R, Neily J, et al. A high-reliability organization framework for health care: a multiyear implementation strategy and associated outcomes. J Patient Saf. 2022;18(1):64-70. doi:10.1097/PTS.0000000000000788

6. National Institute of Standards and Technology. AI risk management framework. Accessed January 2, 2024. https://www.nist.gov/itl/ai-risk-management-framework

7. Executive Office of the President, Office of Science and Technology Policy. Blueprint for an AI Bill of Rights. Accessed January 11, 2024. https://www.whitehouse.gov/ostp/ai-bill-of-rights

8. Executive Office of the President. Executive Order 13960: promoting the use of trustworthy artificial intelligence in the federal government. Fed Regist. 2020;85(236):78939-78943.

9. Biden JR. Executive Order on the safe, secure, and trustworthy development and use of artificial intelligence. Published October 30, 2023. Accessed January 11, 2024. https://www.whitehouse.gov/briefing-room/presidential-actions/2023/10/30/executive-order-on-the-safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence/

10. US Department of Veterans Affairs. Trustworthy AI. Accessed January 11, 2024. https://department.va.gov/ai/trustworthy/

Author and Disclosure Information

David B. Isaacks, FACHEa; Andrew A. Borkowski, MDa,b,c 

Correspondence: Andrew Borkowski (andrew.borkowski@va.gov)

aVeterans Affairs Sunshine Healthcare Network, Tampa, Florida

bUniversity of South Florida Morsani College of Medicine, Tampa

cVeterans Affairs National Artificial Intelligence Institute

Author disclosures

The authors report no actual or potential conflicts of interest or outside sources of funding with regard to this article.

Disclaimer

The opinions expressed herein are those of the authors and do not necessarily reflect those of Federal Practitioner, Frontline Medical Communications Inc., the US Government, or any of its agencies.

Issue
Federal Practitioner - 41(2)a
Page Number
40
Issue
Federal Practitioner - 41(2)a
Page Number
40

Applications of ChatGPT and Large Language Models in Medicine and Health Care: Benefits and Pitfalls

Article Type
Changed
Tue, 06/13/2023 - 13:34

The development of [artificial intelligence] is as fundamental as the creation of the microprocessor, the personal computer, the Internet, and the mobile phone. It will change the way people work, learn, travel, get health care, and communicate with each other.

Bill Gates 1

As the world emerges from the pandemic and the health care system faces new challenges, technology has become an increasingly important tool for health care professionals (HCPs). One such technology is the large language model (LLM), which has the potential to revolutionize the health care industry. ChatGPT, a popular LLM developed by OpenAI, has gained particular attention in the medical community for its ability to pass the United States Medical Licensing Exam.2 This article will explore the benefits and potential pitfalls of using LLMs like ChatGPT in medicine and health care.

Benefits

HCP burnout is a serious issue that can lead to lower productivity, increased medical errors, and decreased patient satisfaction.3 LLMs can alleviate some administrative burdens on HCPs, allowing them to focus on patient care. By assisting with billing, coding, insurance claims, and organizing schedules, LLMs like ChatGPT can free up time for HCPs to focus on what they do best: providing quality patient care.4 ChatGPT also can assist with diagnoses by providing accurate and reliable information based on a vast amount of clinical data. By learning the relationships between different medical conditions, symptoms, and treatment options, ChatGPT can provide an appropriate differential diagnosis (Figure 1). It can also interpret medical tests, such as imaging studies and laboratory results, improving the accuracy of diagnoses.5 LLMs can also identify potential clinical trial opportunities for patients, leading to improved treatment options and outcomes.6

figure_1.png

Imaging medical specialists like radiologists, pathologists, dermatologists, and others can benefit from combining computer vision diagnostics with ChatGPT report creation abilities to streamline the diagnostic workflow and improve diagnostic accuracy (Figure 2). By leveraging the power of LLMs, HCPs can provide faster and more accurate diagnoses, improving patient outcomes. ChatGPT can also help triage patients with urgent issues in the emergency department, reducing the burden on personnel and allowing patients to receive prompt care.7,8

figure_2.png

Although using ChatGPT and other LLMs in mental health care has potential benefits, it is essential to note that they are not a substitute for human interaction and personalized care. While ChatGPT can remember information from previous conversations, it cannot provide the same level of personalized, high-quality care that a professional therapist or HCP can. However, by augmenting the work of HCPs, ChatGPT and other LLMs have the potential to make mental health care more accessible and efficient. In addition to providing effective screening in underserved areas, ChatGPT technology may improve the competence of physician assistants and nurse practitioners in delivering mental health care. With the increased incidence of mental health problems in veterans, the pertinence of a ChatGPT-like feature will only increase with time.9

ChatGPT can also be integrated into health care organizations’ websites and mobile apps, providing patients instant access to medical information, self-care advice, symptom checkers, scheduling appointments, and arranging transportation. These features can reduce the burden on health care staff and help patients stay informed and motivated to take an active role in their health. Additionally, health care organizations can use ChatGPT to engage patients by providing reminders for medication renewals and assistance with self-care.4,6,10,11

Artificial intelligence (AI) also has considerable potential in medical education and research. According to a study by Gilson and colleagues, ChatGPT has shown promising results as a medical education tool.12 ChatGPT can simulate clinical scenarios, provide real-time feedback, and improve diagnostic skills. It also offers new interactive and personalized learning opportunities for medical students and HCPs.13 ChatGPT can help researchers by streamlining the process of data analysis. It can also administer surveys or questionnaires, facilitate data collection on preferences and experiences, and help in writing scientific publications.14 Nevertheless, to fully unlock the potential of these AI models, additional models that perform checks for factual accuracy, plagiarism, and copyright infringement must be developed.15,16

AI Bill of Rights

The White House Office of Science and Technology Policy (OSTP) has released a blueprint for an AI Bill of Rights that emphasizes 5 principles to protect the American public from the harmful effects of AI models: safe and effective systems; algorithmic discrimination protection; data privacy; notice and explanation; and human alternatives, considerations, and fallback (Figure 3).17

figure_3.png
Other trustworthy AI frameworks, such as the White House Executive Order 13960 and the National Institute of Standards and Technology AI Risk Management Framework, are essential to building trust for AI services among HCPs and veteran patients.18,19 Health care products built on ChatGPT must comply with these principles, especially those related to privacy, security, transparency, and explainability, if they are to be trustworthy. Methods like calibration and fine-tuning with specialized data sets from the target population and guiding the model’s behavior with reinforcement learning from human feedback (RLHF) may be beneficial. Preserving the patient’s confidentiality is of utmost importance. For example, Microsoft Azure Machine Learning Services, including ChatGPT GPT-4, are Health Insurance Portability and Accountability Act–certified and could enable the creation of such products.20
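To make the idea of calibration concrete, the following is a minimal sketch of one common way to check it: the expected calibration error, which compares how confident a model says it is with how often it is actually right. The function name, the bin count, and the sample values are assumptions chosen for illustration; this is not the method used by any product or framework cited in this article.

```python
from typing import List, Sequence, Tuple


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 5,
) -> float:
    """Weighted average gap between stated confidence and observed accuracy.

    confidences: model-reported probabilities in [0, 1] (hypothetical input).
    correct: whether each corresponding answer was judged correct.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        raise ValueError("at least one prediction is required")

    # Group predictions into equal-width confidence bins.
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    # Sum each bin's |mean confidence - accuracy|, weighted by bin size.
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


# Illustrative values only: four answers given at 90% confidence, three correct.
sample_ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False])
```

A value near 0 suggests that stated confidence tracks observed accuracy on the test set; larger values suggest the model is overconfident or underconfident and may need recalibration on data from the target population.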

One of the biggest challenges with LLMs like ChatGPT is the prevalence of inaccurate information or so-called hallucinations.16 These inaccuracies stem from the inability of LLMs to distinguish between real and fake information. To reduce hallucinations, researchers have proposed several methods, including training models on more diverse data, using adversarial training methods, and human-in-the-loop approaches.21 In addition, medicine-specific models like GatorTron, Med-PaLM, and Almanac were developed, increasing the accuracy of factual results.22-24 Unfortunately, only the GatorTron model is available to the public through the NVIDIA developers’ program.25
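A human-in-the-loop approach can be as simple as a rule that decides which model outputs are held for review. The sketch below is a hypothetical illustration only: the class, the field names, the confidence score, and the 0.8 threshold are all assumptions, and a real deployment would set its own criteria through clinical governance.

```python
from dataclasses import dataclass


@dataclass
class DraftAnswer:
    """A model-generated draft awaiting release (hypothetical structure)."""

    text: str
    confidence: float   # assumed model-reported score in [0, 1]
    cites_source: bool  # whether the draft points to a verifiable source


def route(draft: DraftAnswer, threshold: float = 0.8) -> str:
    """Release a draft only when it is both confident and sourced.

    Everything else is sent to a clinician for review before use.
    """
    if draft.confidence >= threshold and draft.cites_source:
        return "release"
    return "clinician_review"


# Illustrative use: an unsourced answer is held for review despite high confidence.
decision = route(DraftAnswer("Example draft text", confidence=0.95, cites_source=False))
```

The design choice here is to default to review: a draft must pass every check to bypass a human, so a missing source or a low score is enough to hold it.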

Despite these shortcomings, the future of LLMs in health care is promising. Although these models will not replace HCPs, they can help reduce the unnecessary burden on them, prevent burnout, and enable HCPs and patients to spend more time together. Establishing an official hospital AI oversight governing body that would promote best practices could ensure the trustworthy implementation of these new technologies.26

Conclusions

The use of ChatGPT and other LLMs in health care has the potential to revolutionize the industry. By assisting HCPs with administrative tasks, improving the accuracy and reliability of diagnoses, and engaging patients, ChatGPT can help health care organizations provide better care to their patients. While LLMs are not a substitute for human interaction and personalized care, they can augment the work of HCPs, making health care more accessible and efficient. As the health care industry continues to evolve, it will be exciting to see how ChatGPT and other LLMs are used to improve patient outcomes and quality of care. In addition, AI technologies like ChatGPT offer enormous potential in medical education and research. To ensure that the benefits outweigh the risks, it is essential to develop trustworthy AI health care products and to establish oversight governing bodies that guide their implementation. By doing so, we can help HCPs focus on what matters most: providing high-quality care to patients.

Acknowledgments

This material is the result of work supported by resources and the use of facilities at the James A. Haley Veterans’ Hospital.

References

1. Bill Gates. The age of AI has begun. March 21, 2023. Accessed May 10, 2023. https://www.gatesnotes.com/the-age-of-ai-has-begun

2. Kung TH, Cheatham M, Medenilla A, et al. Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models. PLOS Digit Health. 2023;2(2):e0000198. Published 2023 Feb 9. doi:10.1371/journal.pdig.0000198

3. Shanafelt TD, West CP, Sinsky C, et al. Changes in burnout and satisfaction with work-life integration in physicians and the general US working population between 2011 and 2020. Mayo Clin Proc. 2022;97(3):491-506. doi:10.1016/j.mayocp.2021.11.021

4. Goodman RS, Patrinely JR Jr, Osterman T, Wheless L, Johnson DB. On the cusp: considering the impact of artificial intelligence language models in healthcare. Med. 2023;4(3):139-140. doi:10.1016/j.medj.2023.02.008

5. Will ChatGPT transform healthcare? Nat Med. 2023;29(3):505-506. doi:10.1038/s41591-023-02289-5

6. Hopkins AM, Logan JM, Kichenadasse G, Sorich MJ. Artificial intelligence chatbots will revolutionize how cancer patients access information: ChatGPT represents a paradigm-shift. JNCI Cancer Spectr. 2023;7(2):pkad010. doi:10.1093/jncics/pkad010

7. Babar Z, van Laarhoven T, Zanzotto FM, Marchiori E. Evaluating diagnostic content of AI-generated radiology reports of chest X-rays. Artif Intell Med. 2021;116:102075. doi:10.1016/j.artmed.2021.102075

8. Lecler A, Duron L, Soyer P. Revolutionizing radiology with GPT-based models: current applications, future possibilities and limitations of ChatGPT. Diagn Interv Imaging. 2023;S2211-5684(23)00027-X. doi:10.1016/j.diii.2023.02.003

9. Germain JM. Is ChatGPT smart enough to practice mental health therapy? March 23, 2023. Accessed May 11, 2023. https://www.technewsworld.com/story/is-chatgpt-smart-enough-to-practice-mental-health-therapy-178064.html

10. Cascella M, Montomoli J, Bellini V, Bignami E. Evaluating the feasibility of ChatGPT in healthcare: an analysis of multiple clinical and research scenarios. J Med Syst. 2023;47(1):33. Published 2023 Mar 4. doi:10.1007/s10916-023-01925-4

11. Jungwirth D, Haluza D. Artificial intelligence and public health: an exploratory study. Int J Environ Res Public Health. 2023;20(5):4541. Published 2023 Mar 3. doi:10.3390/ijerph20054541

12. Gilson A, Safranek CW, Huang T, et al. How does ChatGPT perform on the United States Medical Licensing Examination? The implications of large language models for medical education and knowledge assessment. JMIR Med Educ. 2023;9:e45312. Published 2023 Feb 8. doi:10.2196/45312

13. Eysenbach G. The role of ChatGPT, generative language models, and artificial intelligence in medical education: a conversation with ChatGPT and a call for papers. JMIR Med Educ. 2023;9:e46885. Published 2023 Mar 6. doi:10.2196/46885

14. Macdonald C, Adeloye D, Sheikh A, Rudan I. Can ChatGPT draft a research article? An example of population-level vaccine effectiveness analysis. J Glob Health. 2023;13:01003. Published 2023 Feb 17. doi:10.7189/jogh.13.01003

15. Masters K. Ethical use of artificial intelligence in health professions education: AMEE Guide No.158. Med Teach. 2023;1-11. doi:10.1080/0142159X.2023.2186203

16. Smith CS. Hallucinations could blunt ChatGPT’s success. IEEE Spectrum. March 13, 2023. Accessed May 11, 2023. https://spectrum.ieee.org/ai-hallucination

17. Executive Office of the President, Office of Science and Technology Policy. Blueprint for an AI Bill of Rights. Accessed May 11, 2023. https://www.whitehouse.gov/ostp/ai-bill-of-rights

18. Executive Office of the President. Executive Order 13960: promoting the use of trustworthy artificial intelligence in the federal government. Fed Regist. 2020;85(236):78939-78943.

19. US Department of Commerce, National Institute of Standards and Technology. Artificial Intelligence Risk Management Framework (AI RMF 1.0). Published January 2023. doi:10.6028/NIST.AI.100-1

20. Microsoft. Azure Cognitive Search—Cloud Search Service. Accessed May 11, 2023. https://azure.microsoft.com/en-us/products/search

21. Aiyappa R, An J, Kwak H, Ahn YY. Can we trust the evaluation on ChatGPT? March 22, 2023. Accessed May 11, 2023. https://arxiv.org/abs/2303.12767v1

22. Yang X, Chen A, Pournejatian N, et al. GatorTron: a large clinical language model to unlock patient information from unstructured electronic health records. Updated December 16, 2022. Accessed May 11, 2023. https://arxiv.org/abs/2203.03540v3

23. Singhal K, Azizi S, Tu T, et al. Large language models encode clinical knowledge. December 26, 2022. Accessed May 11, 2023. https://arxiv.org/abs/2212.13138v1

24. Zakka C, Chaurasia A, Shad R, Hiesinger W. Almanac: knowledge-grounded language models for clinical medicine. March 1, 2023. Accessed May 11, 2023. https://arxiv.org/abs/2303.01229v1

25. NVIDIA. GatorTron-OG. Accessed May 11, 2023. https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/models/gatortron_og

26. Borkowski AA, Jakey CE, Thomas LB, Viswanadhan N, Mastorides SM. Establishing a hospital artificial intelligence committee to improve patient care. Fed Pract. 2022;39(8):334-336. doi:10.12788/fp.0299

Author and Disclosure Information

Andrew A. Borkowski, MD (a,b,c); Colleen E. Jakey, MD (a,b); Stephen M. Mastorides, MD (a,b); Ana L. Kraus, MD (a,b); Gitanjali Vidyarthi, MD (a,b); Narayan Viswanadhan, MD (a,b); Jose L. Lezama, MD (a,b)

Correspondence: Andrew Borkowski (andrew.borkowski@va.gov)

(a) James A. Haley Veterans’ Hospital, Tampa, Florida

(b) University of South Florida Morsani College of Medicine, Tampa

(c) National Artificial Intelligence Institute, Washington, DC

Author disclosures

The authors report no actual or potential conflicts of interest or outside sources of funding with regard to this article.

Disclaimer

The opinions expressed herein are those of the authors and do not necessarily reflect those of Federal Practitioner, Frontline Medical Communications Inc., the U.S. Government, or any of its agencies.

Issue
Federal Practitioner - 40(6)a
Page Number
170-173

The development of [artificial intelligence] is as fundamental as the creation of the microprocessor, the personal computer, the Internet, and the mobile phone. It will change the way people work, learn, travel, get health care, and communicate with each other.

Bill Gates 1

As the world emerges from the pandemic and the health care system faces new challenges, technology has become an increasingly important tool for health care professionals (HCPs). One such technology is the large language model (LLM), which has the potential to revolutionize the health care industry. ChatGPT, a popular LLM developed by OpenAI, has gained particular attention in the medical community for performing at or near the passing threshold on the United States Medical Licensing Examination.2 This article explores the benefits and potential pitfalls of using LLMs like ChatGPT in medicine and health care.

Benefits

HCP burnout is a serious issue that can lead to lower productivity, increased medical errors, and decreased patient satisfaction.3 LLMs can alleviate some administrative burdens on HCPs. By assisting with billing, coding, insurance claims, and scheduling, LLMs like ChatGPT can free up time for HCPs to focus on what they do best: providing quality patient care.4 ChatGPT also can assist with diagnoses by providing information based on a vast amount of clinical data. By learning the relationships between different medical conditions, symptoms, and treatment options, ChatGPT can suggest an appropriate differential diagnosis (Figure 1). It can also help interpret medical tests, such as imaging studies and laboratory results, which may improve the accuracy of diagnoses.5 LLMs can also identify potential clinical trial opportunities for patients, leading to improved treatment options and outcomes.6

figure_1.png

Imaging-based specialists such as radiologists, pathologists, and dermatologists can benefit from combining computer vision diagnostics with ChatGPT report creation abilities to streamline the diagnostic workflow and improve diagnostic accuracy (Figure 2). By leveraging the power of LLMs, HCPs can provide faster and more accurate diagnoses, improving patient outcomes. ChatGPT can also help triage patients with urgent issues in the emergency department, reducing the burden on personnel and allowing patients to receive prompt care.7,8

figure_2.png

Although using ChatGPT and other LLMs in mental health care has potential benefits, they are not a substitute for human interaction and personalized care. While ChatGPT can remember information from previous conversations, it cannot provide the same level of personalized, high-quality care that a professional therapist or HCP can. However, by augmenting the work of HCPs, ChatGPT and other LLMs have the potential to make mental health care more accessible and efficient. In addition to providing effective screening in underserved areas, ChatGPT technology may improve the competence of physician assistants and nurse practitioners in delivering mental health care. With the increased incidence of mental health problems in veterans, the relevance of ChatGPT-like tools will only increase with time.9

ChatGPT can also be integrated into health care organizations’ websites and mobile apps, giving patients instant access to medical information, self-care advice, symptom checkers, appointment scheduling, and transportation arrangements. These features can reduce the burden on health care staff and help patients stay informed and motivated to take an active role in their health. Additionally, health care organizations can use ChatGPT to engage patients by providing reminders for medication renewals and assistance with self-care.4,6,10,11

The potential of artificial intelligence (AI) in medical education and research is immense. According to a study by Gilson and colleagues, ChatGPT has shown promising results as a medical education tool.12 ChatGPT can simulate clinical scenarios, provide real-time feedback, and help learners improve their diagnostic skills. It also offers new interactive and personalized learning opportunities for medical students and HCPs.13 ChatGPT can help researchers by streamlining data analysis. It can also administer surveys or questionnaires, facilitate data collection on preferences and experiences, and help in writing scientific publications.14 Nevertheless, to fully unlock the potential of these AI models, additional models that check for factual accuracy, plagiarism, and copyright infringement must be developed.15,16

AI Bill of Rights

The White House Office of Science and Technology Policy (OSTP) has released a blueprint for an AI Bill of Rights that emphasizes 5 principles to protect the American public from the harmful effects of AI models: safe and effective systems; algorithmic discrimination protection; data privacy; notice and explanation; and human alternatives, considerations, and fallback (Figure 3).17 Other trustworthy AI frameworks, such as the White House Executive Order 13960 and the National Institute of Standards and Technology AI Risk Management Framework, are essential to building trust for AI services among HCPs and veteran patients.18,19 To ensure that ChatGPT complies with these principles, especially those related to privacy, security, transparency, and explainability, it is essential to develop trustworthy AI health care products. Methods like calibration and fine-tuning with specialized data sets from the target population and guiding the model’s behavior with reinforcement learning from human feedback (RLHF) may be beneficial. Preserving the patient’s confidentiality is of utmost importance. For example, Microsoft Azure Machine Learning Services, including ChatGPT GPT-4, are described as compliant with the Health Insurance Portability and Accountability Act and could enable the creation of such products.20

figure_3.png
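One way to make the calibration mentioned above concrete is to measure how closely a model's stated confidence tracks its observed accuracy. The following is a minimal sketch of expected calibration error (ECE), a commonly used metric. It is not taken from the article or from any specific product: the function name, the equal-width binning, and the example numbers are illustrative assumptions.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: hypothetical model-reported probabilities in [0, 1].
    correct: whether each corresponding answer was judged correct.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        return 0.0

    # Group predictions into equal-width confidence bins.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Each bin contributes in proportion to how many predictions it holds.
        ece += (len(members) / total) * abs(mean_conf - accuracy)
    return ece


if __name__ == "__main__":
    # Illustrative data only: four answers at 90% confidence, three correct.
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False]))
```

A value near 0 suggests that stated confidence matches observed accuracy on the evaluated sample. Larger values suggest overconfidence or underconfidence, which may warrant recalibration on data from the target population.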

One of the biggest challenges with LLMs like ChatGPT is the prevalence of inaccurate information or so-called hallucinations.16 These inaccuracies stem from the inability of LLMs to distinguish between real and fake information. To reduce hallucinations, researchers have proposed several methods, including training models on more diverse data, using adversarial training methods, and adopting human-in-the-loop approaches.21 In addition, medicine-specific models like GatorTron, Med-PaLM, and Almanac were developed, increasing the accuracy of factual results.22-24 Unfortunately, only the GatorTron model is available to the public through the NVIDIA developers’ program.25
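A human-in-the-loop approach can be pictured as a simple routing rule placed between the model and the patient record. The sketch below is hypothetical: the `DraftAnswer` fields, the 0.8 threshold, and the queue names are assumptions made for illustration and do not describe any deployed system.

```python
from dataclasses import dataclass, field


@dataclass
class DraftAnswer:
    """Hypothetical container for a model-generated draft."""
    text: str
    confidence: float  # assumed model-reported score in [0, 1]
    cited_sources: list = field(default_factory=list)


def route(draft: DraftAnswer, threshold: float = 0.8) -> str:
    """Decide which human step a draft goes to; nothing is released unreviewed."""
    # Low confidence or no supporting source: send for full clinician review.
    if draft.confidence < threshold or not draft.cited_sources:
        return "clinician_review"
    # Otherwise the draft still requires a clinician's sign-off before use.
    return "clinician_sign_off"


if __name__ == "__main__":
    print(route(DraftAnswer("Possible iron deficiency anemia.", 0.55)))
    print(route(DraftAnswer("Follow-up in 2 weeks.", 0.93, ["local guideline"])))
```

In this sketch every path ends with a clinician, consistent with the view that these models augment rather than replace HCPs. The threshold only determines how much scrutiny a draft receives.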

Despite these shortcomings, the future of LLMs in health care is promising. Although these models will not replace HCPs, they can help reduce the unnecessary burden on them, prevent burnout, and enable HCPs and patients to spend more time together. Establishing an official hospital AI oversight governing body that would promote best practices could ensure the trustworthy implementation of these new technologies.26

Conclusions

The use of ChatGPT and other LLMs in health care has the potential to revolutionize the industry. By assisting HCPs with administrative tasks, improving the accuracy and reliability of diagnoses, and engaging patients, ChatGPT can help health care organizations provide better care to their patients. While LLMs are not a substitute for human interaction and personalized care, they can augment the work of HCPs, making health care more accessible and efficient. As the health care industry continues to evolve, it will be exciting to see how ChatGPT and other LLMs are used to improve patient outcomes and quality of care. In addition, AI technologies like ChatGPT offer enormous potential in medical education and research. To ensure that the benefits outweigh the risks, it is essential to develop trustworthy AI health care products and to establish oversight governing bodies that guide their implementation. By doing so, we can help HCPs focus on what matters most: providing high-quality care to patients.

Acknowledgments

This material is the result of work supported by resources and the use of facilities at the James A. Haley Veterans’ Hospital.

References

1. Bill Gates. The age of AI has begun. March 21, 2023. Accessed May 10, 2023. https://www.gatesnotes.com/the-age-of-ai-has-begun

2. Kung TH, Cheatham M, Medenilla A, et al. Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models. PLOS Digit Health. 2023;2(2):e0000198. Published 2023 Feb 9. doi:10.1371/journal.pdig.0000198

3. Shanafelt TD, West CP, Sinsky C, et al. Changes in burnout and satisfaction with work-life integration in physicians and the general US working population between 2011 and 2020. Mayo Clin Proc. 2022;97(3):491-506. doi:10.1016/j.mayocp.2021.11.021

4. Goodman RS, Patrinely JR Jr, Osterman T, Wheless L, Johnson DB. On the cusp: considering the impact of artificial intelligence language models in healthcare. Med. 2023;4(3):139-140. doi:10.1016/j.medj.2023.02.008

5. Will ChatGPT transform healthcare? Nat Med. 2023;29(3):505-506. doi:10.1038/s41591-023-02289-5

6. Hopkins AM, Logan JM, Kichenadasse G, Sorich MJ. Artificial intelligence chatbots will revolutionize how cancer patients access information: ChatGPT represents a paradigm-shift. JNCI Cancer Spectr. 2023;7(2):pkad010. doi:10.1093/jncics/pkad010

7. Babar Z, van Laarhoven T, Zanzotto FM, Marchiori E. Evaluating diagnostic content of AI-generated radiology reports of chest X-rays. Artif Intell Med. 2021;116:102075. doi:10.1016/j.artmed.2021.102075

8. Lecler A, Duron L, Soyer P. Revolutionizing radiology with GPT-based models: current applications, future possibilities and limitations of ChatGPT. Diagn Interv Imaging. 2023;S2211-5684(23)00027-X. doi:10.1016/j.diii.2023.02.003

9. Germain JM. Is ChatGPT smart enough to practice mental health therapy? March 23, 2023. Accessed May 11, 2023. https://www.technewsworld.com/story/is-chatgpt-smart-enough-to-practice-mental-health-therapy-178064.html

10. Cascella M, Montomoli J, Bellini V, Bignami E. Evaluating the feasibility of ChatGPT in healthcare: an analysis of multiple clinical and research scenarios. J Med Syst. 2023;47(1):33. Published 2023 Mar 4. doi:10.1007/s10916-023-01925-4

11. Jungwirth D, Haluza D. Artificial intelligence and public health: an exploratory study. Int J Environ Res Public Health. 2023;20(5):4541. Published 2023 Mar 3. doi:10.3390/ijerph20054541

12. Gilson A, Safranek CW, Huang T, et al. How does ChatGPT perform on the United States Medical Licensing Examination? The implications of large language models for medical education and knowledge assessment. JMIR Med Educ. 2023;9:e45312. Published 2023 Feb 8. doi:10.2196/45312

13. Eysenbach G. The role of ChatGPT, generative language models, and artificial intelligence in medical education: a conversation with ChatGPT and a call for papers. JMIR Med Educ. 2023;9:e46885. Published 2023 Mar 6. doi:10.2196/46885

14. Macdonald C, Adeloye D, Sheikh A, Rudan I. Can ChatGPT draft a research article? An example of population-level vaccine effectiveness analysis. J Glob Health. 2023;13:01003. Published 2023 Feb 17. doi:10.7189/jogh.13.01003

15. Masters K. Ethical use of artificial intelligence in health professions education: AMEE Guide No.158. Med Teach. 2023;1-11. doi:10.1080/0142159X.2023.2186203

16. Smith CS. Hallucinations could blunt ChatGPT’s success. IEEE Spectrum. March 13, 2023. Accessed May 11, 2023. https://spectrum.ieee.org/ai-hallucination

17. Executive Office of the President, Office of Science and Technology Policy. Blueprint for an AI Bill of Rights. Accessed May 11, 2023. https://www.whitehouse.gov/ostp/ai-bill-of-rights

18. Executive Office of the President. Executive Order 13960: promoting the use of trustworthy artificial intelligence in the federal government. Fed Regist. 2020;85(236):78939-78943.

19. US Department of Commerce, National Institute of Standards and Technology. Artificial Intelligence Risk Management Framework (AI RMF 1.0). Published January 2023. doi:10.6028/NIST.AI.100-1

20. Microsoft. Azure Cognitive Search—Cloud Search Service. Accessed May 11, 2023. https://azure.microsoft.com/en-us/products/search

21. Aiyappa R, An J, Kwak H, Ahn YY. Can we trust the evaluation on ChatGPT? March 22, 2023. Accessed May 11, 2023. https://arxiv.org/abs/2303.12767v1

22. Yang X, Chen A, Pournejatian N, et al. GatorTron: a large clinical language model to unlock patient information from unstructured electronic health records. Updated December 16, 2022. Accessed May 11, 2023. https://arxiv.org/abs/2203.03540v3

23. Singhal K, Azizi S, Tu T, et al. Large language models encode clinical knowledge. December 26, 2022. Accessed May 11, 2023. https://arxiv.org/abs/2212.13138v1

24. Zakka C, Chaurasia A, Shad R, Hiesinger W. Almanac: knowledge-grounded language models for clinical medicine. March 1, 2023. Accessed May 11, 2023. https://arxiv.org/abs/2303.01229v1

25. NVIDIA. GatorTron-OG. Accessed May 11, 2023. https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/models/gatortron_og

26. Borkowski AA, Jakey CE, Thomas LB, Viswanadhan N, Mastorides SM. Establishing a hospital artificial intelligence committee to improve patient care. Fed Pract. 2022;39(8):334-336. doi:10.12788/fp.0299

Teambase XML
<?xml version="1.0" encoding="UTF-8"?>
<!--$RCSfile: InCopy_agile.xsl,v $ $Revision: 1.35 $-->

Establishing a Hospital Artificial Intelligence Committee to Improve Patient Care

Article Type
Changed
Tue, 08/09/2022 - 14:08

In the past 10 years, artificial intelligence (AI) applications have exploded in numerous fields, including medicine. Myriad publications report that the use of AI in health care is increasing, and AI has shown utility in many medical specialties, eg, pathology, radiology, and oncology.1,2

In cancer pathology, AI has been able not only to detect various cancers but also to subtype and grade them. In addition, AI could predict survival, therapeutic response, and underlying mutations from histopathologic images.3 AI applications in other medical fields are equally notable. For example, in imaging specialties like radiology, ophthalmology, dermatology, and gastroenterology, AI is being used for image recognition, enhancement, and segmentation. AI is also beneficial for predicting disease progression, survival, and response to therapy in other medical specialties. Finally, AI may help with administrative tasks like scheduling.

However, many obstacles to successfully implementing AI programs in the clinical setting exist, including clinical data limitations and ethical use of data, trust in the AI models, regulatory barriers, and lack of clinical buy-in due to insufficient basic AI understanding.2 To address these barriers to successful clinical AI implementation, we decided to create a formal governing body at James A. Haley Veterans’ Hospital in Tampa, Florida. Accordingly, the hospital AI committee charter was officially approved on July 22, 2021. Our model could be used by both US Department of Veterans Affairs (VA) and non-VA hospitals throughout the country.

 

AI Committee

The vision of the AI committee is to improve outcomes and experiences for our veterans by developing trustworthy AI capabilities to support the VA mission. The mission is to build robust capacity in AI to create and apply innovative AI solutions and transform the VA by facilitating a learning environment that supports the delivery of world-class benefits and services to our veterans. Our vision and mission are aligned with the VA National AI Institute.4

The AI Committee comprises 7 subcommittees: ethics, AI clinical product evaluation, education, data sharing and acquisition, research, 3D printing, and improvement and innovation. The role of the ethics subcommittee is to ensure the ethical and equitable implementation of clinical AI. We created the ethics subcommittee guidelines based on the World Health Organization ethics and governance of AI for health documents.5 They include 6 basic principles: protecting human autonomy; promoting human well-being and safety and the public interest; ensuring transparency, explainability, and intelligibility; fostering responsibility and accountability; ensuring inclusiveness and equity; and promoting AI that is responsive and sustainable (Table 1).

fdp03908334_t2.png

fdp03908334_t1.png


As the name indicates, the role of the AI clinical product evaluation subcommittee is to evaluate commercially available clinical AI products. More than 400 US Food and Drug Administration–approved AI medical applications exist, and the list is growing rapidly. Most AI applications are in medical imaging like radiology, dermatology, ophthalmology, and pathology.6,7 Each clinical product is evaluated according to 6 principles: relevance, usability, risks, regulatory, technical requirements, and financial (Table 2).8 We are in the process of evaluating a few commercial AI algorithms for pathology and radiology, using these 6 principles.
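As an illustration only, the 6-principle review described above could be tracked with a simple checklist structure. The sketch below is a hypothetical example, not the subcommittee's actual tool: the class name, method names, and the pass/fail representation are assumptions, and only the 6 principle names come from the text.

```python
from dataclasses import dataclass, field

# The 6 principles named in the text; everything else is a hypothetical illustration.
PRINCIPLES = (
    "relevance",
    "usability",
    "risks",
    "regulatory",
    "technical requirements",
    "financial",
)


@dataclass
class ProductEvaluation:
    """Hypothetical record of one review of a commercial clinical AI product."""

    product: str
    # Each principle maps to True (satisfied), False (not satisfied),
    # or None (not yet reviewed).
    findings: dict = field(default_factory=lambda: {p: None for p in PRINCIPLES})

    def record(self, principle: str, satisfied: bool) -> None:
        """Store the reviewers' finding for one principle."""
        if principle not in self.findings:
            raise ValueError(f"unknown principle: {principle}")
        self.findings[principle] = satisfied

    def outstanding(self) -> list:
        """Principles not yet reviewed, in the order listed in the text."""
        return [p for p in PRINCIPLES if self.findings[p] is None]

    def ready_for_decision(self) -> bool:
        """True only when all 6 principles have been reviewed."""
        return not self.outstanding()


review = ProductEvaluation("Example radiology algorithm")  # placeholder product name
review.record("relevance", True)
review.record("risks", True)
print(review.outstanding())
```

In this sketch a product cannot be marked ready for a decision until every principle has a recorded finding, which mirrors the idea that each product is assessed against all 6 principles rather than a subset.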

 

 

Implementations

After a comprehensive evaluation, we implemented 2 ClearRead (Riverain Technologies) AI radiology solutions. ClearRead CT Vessel Suppress produces a secondary series of computed tomography (CT) images in which vessels and other normal structures within the lungs are suppressed to improve nodule detectability. ClearRead Xray Bone Suppress increases the visibility of soft tissue in standard chest X-rays by suppressing the bone on the digital image without the need for 2 exposures.

The role of the education subcommittee is to educate the staff about AI and how it can improve patient care. Every Friday, we email an AI article of the week to our practitioners. In addition, we publish a newsletter, and we organize an annual AI conference. The first conference in 2022 included speakers from the National AI Institute, Moffitt Cancer Center, the University of South Florida, and our facility.

As the name indicates, the data sharing and acquisition subcommittee oversees preparing data for our clinical and research projects. The role of the research subcommittee is to coordinate and promote AI research with the ultimate goal of improving patient care.

 

Other Technologies

Although 3D printing does not fall under the umbrella of AI, we have decided to include it in our future-oriented AI committee. We created an online 3D printing course to promote the technology throughout the VA. We 3D print organ models to help surgeons prepare for complicated operations. In addition, together with our colleagues from the University of Florida, we used 3D printing to address the shortage of swabs for COVID-19 testing. The VA Sunshine Healthcare Network (Veterans Integrated Services Network 8) has an active Innovation and Improvement Committee.9 Our improvement and innovation subcommittee serves as a coordinating body with the network committee.

Conclusions

Through the hospital AI committee, we believe that we may overcome many obstacles to successfully implementing AI applications in the clinical setting, including the ethical use of data, trust in the AI models, regulatory barriers, and lack of clinical buy-in due to insufficient basic AI knowledge.

Acknowledgments

This material is the result of work supported with resources and the use of facilities at the James A. Haley Veterans’ Hospital.

Author and Disclosure Information

Andrew A. Borkowski, MDa,b; Colleen E. Jakey, MDa,b; L. Brannon Thomas, MD, PhDa,b; Narayan Viswanadhan, MDa,b; Stephen M. Mastorides, MDa,b
Correspondence:
Andrew Borkowski (andrew.borkowski@va.gov)

aJames A. Haley Veterans’ Hospital, Tampa, Florida
bUniversity of South Florida Morsani College of Medicine, Tampa

Author disclosures

The authors report no actual or potential conflicts of interest or outside sources of funding with regard to this article.

Disclaimer

The opinions expressed herein are those of the authors and do not necessarily reflect those of Federal Practitioner, Frontline Medical Communications Inc., the US Government, or any of its agencies.

References

1. Thomas LB, Mastorides SM, Viswanadhan N, Jakey CE, Borkowski AA. Artificial intelligence: review of current and future applications in medicine. Fed Pract. 2021;38(11):527-538. doi:10.12788/fp.0174

2. Rajpurkar P, Chen E, Banerjee O, Topol EJ. AI in health and medicine. Nat Med. 2022;28(1):31-38. doi:10.1038/s41591-021-01614-0

3. Echle A, Rindtorff NT, Brinker TJ, Luedde T, Pearson AT, Kather JN. Deep learning in cancer pathology: a new generation of clinical biomarkers. Br J Cancer. 2021;124(4):686-696. doi:10.1038/s41416-020-01122-x

4. US Department of Veterans Affairs, Office of Research and Development. National Artificial Intelligence Institute. Accessed April 13, 2022. https://www.research.va.gov/naii

5. World Health Organization. Ethics and governance of artificial intelligence for health. Updated June 6, 2022. Accessed June 24, 2022. https://www.who.int/publications/i/item/9789240029200

6. US Food and Drug Administration. Artificial intelligence and machine learning (AI/ML)-enabled medical devices. Updated September 22, 2021. Accessed June 24, 2022. https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-aiml-enabled-medical-devices

7. Muehlematter UJ, Daniore P, Vokinger KN. Approval of artificial intelligence and machine learning-based medical devices in the USA and Europe (2015–20): a comparative analysis. The Lancet Digital Health. 2021;3(3):e195-e203. doi:10.1016/S2589-7500(20)30292-2

8. Omoumi P, Ducarouge A, Tournier A, et al. To buy or not to buy-evaluating commercial AI solutions in radiology (the ECLAIR guidelines). Eur Radiol. 2021;31(6):3786-3796. doi:10.1007/s00330-020-07684-x

9. US Department of Veterans Affairs. VA Sunshine Healthcare Network. Updated June 21, 2022. Accessed June 24, 2022. https://www.visn8.va.gov

Issue
Federal Practitioner - 39(8)a
Page Number
334-336

In the past 10 years, artificial intelligence (AI) applications have exploded in numerous fields, including medicine. Myriad publications report that the use of AI in health care is increasing, and AI has shown utility in many medical specialties, eg, pathology, radiology, and oncology.1,2

In cancer pathology, AI was able not only to detect various cancers, but also to subtype and grade them. In addition, AI could predict survival, the success of therapeutic response, and underlying mutations from histopathologic images.3 In other medical fields, AI applications are as notable. For example, in imaging specialties like radiology, ophthalmology, dermatology, and gastroenterology, AI is being used for image recognition, enhancement, and segmentation. In addition, AI is beneficial for predicting disease progression, survival, and response to therapy in other medical specialties. Finally, AI may help with administrative tasks like scheduling.

However, many obstacles to successfully implementing AI programs in the clinical setting exist, including clinical data limitations and ethical use of data, trust in the AI models, regulatory barriers, and lack of clinical buy-in due to insufficient basic AI understanding.2 To address these barriers to successful clinical AI implementation, we decided to create a formal governing body at James A. Haley Veterans’ Hospital in Tampa, Florida. Accordingly, the hospital AI committee charter was officially approved on July 22, 2021. Our model could be used by both US Department of Veterans Affairs (VA) and non-VA hospitals throughout the country.

 

AI Committee

The vision of the AI committee is to improve outcomes and experiences for our veterans by developing trustworthy AI capabilities to support the VA mission. The mission is to build robust capacity in AI to create and apply innovative AI solutions and transform the VA by facilitating a learning environment that supports the delivery of world-class benefits and services to our veterans. Our vision and mission are aligned with the VA National AI Institute.4

The AI Committee comprises 7 subcommittees: ethics, AI clinical product evaluation, education, data sharing and acquisition, research, 3D printing, and improvement and innovation. The role of the ethics subcommittee is to ensure the ethical and equitable implementation of clinical AI. We created the ethics subcommittee guidelines based on the World Health Organization ethics and governance of AI for health documents.5 They include 6 basic principles: protecting human autonomy; promoting human well-being and safety and the public interest; ensuring transparency, explainability, and intelligibility; fostering responsibility and accountability; ensuring inclusiveness and equity; and promoting AI that is responsive and sustainable (Table 1).

fdp03908334_t2.png

fdp03908334_t1.png


As the name indicates, the role of the AI clinical product evaluation subcommittee is to evaluate commercially available clinical AI products. More than 400 US Food and Drug Administration–approved AI medical applications exist, and the list is growing rapidly. Most AI applications are in medical imaging like radiology, dermatology, ophthalmology, and pathology.6,7 Each clinical product is evaluated according to 6 principles: relevance, usability, risks, regulatory, technical requirements, and financial (Table 2).8 We are in the process of evaluating a few commercial AI algorithms for pathology and radiology, using these 6 principles.

 

 

Implementations

After a comprehensive evaluation, we implemented 2 ClearRead (Riverain Technologies) AI radiology solutions. ClearRead CT Vessel Suppress produces a secondary series of computed tomography (CT) images, suppressing vessels and other normal structures within the lungs to improve nodule detectability. ClearRead Xray Bone Suppress increases the visibility of soft tissue in standard chest X-rays by suppressing the bone on the digital image without the need for 2 exposures.

The role of the education subcommittee is to educate the staff about AI and how it can improve patient care. Every Friday, we email an AI article of the week to our practitioners. In addition, we publish a newsletter, and we organize an annual AI conference. The first conference in 2022 included speakers from the National AI Institute, Moffitt Cancer Center, the University of South Florida, and our facility.

As the name indicates, the data sharing and acquisition subcommittee oversees preparing data for our clinical and research projects. The role of the research subcommittee is to coordinate and promote AI research with the ultimate goal of improving patient care.

 

Other Technologies

Although 3D printing does not fall under the umbrella of AI, we have decided to include it in our future-oriented AI committee. We created an online 3D printing course to promote the technology throughout the VA. We 3D print organ models to help surgeons prepare for complicated operations. In addition, together with our colleagues from the University of Florida, we used 3D printing to address the shortage of swabs for COVID-19 testing. The VA Sunshine Healthcare Network (Veterans Integrated Services Network 8) has an active Innovation and Improvement Committee.9 Our improvement and innovation subcommittee serves as a coordinating body with the network committee.

Conclusions

Through the hospital AI committee, we believe that we may overcome many obstacles to successfully implementing AI applications in the clinical setting, including the ethical use of data, trust in the AI models, regulatory barriers, and lack of clinical buy-in due to insufficient basic AI knowledge.

Acknowledgments

This material is the result of work supported with resources and the use of facilities at the James A. Haley Veterans’ Hospital.

Issue
Federal Practitioner - 39(8)a
Page Number
334-336

Artificial Intelligence: Review of Current and Future Applications in Medicine

Article Type
Changed
Mon, 11/08/2021 - 15:36

Artificial intelligence (AI) was first described in 1956 and refers to machines having the ability to learn as they receive and process information, resulting in the ability to “think” like humans.1 AI’s impact in medicine is increasing; currently, at least 29 AI medical devices and algorithms are approved by the US Food and Drug Administration (FDA) in a variety of areas, including radiograph interpretation, managing glucose levels in patients with diabetes mellitus, analyzing electrocardiograms (ECGs), and diagnosing sleep disorders, among others.2 Significantly, in 2020, the Centers for Medicare and Medicaid Services (CMS) announced the first reimbursement to hospitals for an AI platform, a model for early detection of strokes.3 AI is rapidly becoming an integral part of health care, and its role will only increase in the future (Table).

fdp03811527_t.png

As knowledge in medicine is expanding exponentially, AI has great potential to assist with handling complex patient care data. The concept of exponential growth is not a natural one. As Bini described, with exponential growth the volume of knowledge amassed over the past 10 years will now occur in perhaps only 1 year.1 Likewise, equivalent advances over the past year may take just a few months. This phenomenon is partly due to the law of accelerating returns, which states that advances feed on themselves, continually increasing the rate of further advances.4 The volume of medical data doubles every 2 to 5 years.5 Fortunately, the field of AI is growing exponentially as well and can help health care practitioners (HCPs) keep pace, allowing the continued delivery of effective health care.
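The doubling-time figure can be made concrete with a short calculation. The following is only an illustrative sketch: the function name growth_factor and the 10-year horizon are assumptions chosen for the example, and only the 2-to-5-year doubling range comes from the text.

```python
def growth_factor(years: float, doubling_time_years: float) -> float:
    """Factor by which a quantity grows over `years` if it doubles
    every `doubling_time_years` (simple exponential growth)."""
    return 2 ** (years / doubling_time_years)


# Compare the two ends of the 2-to-5-year doubling range over a decade.
for doubling_time in (2, 5):
    factor = growth_factor(10, doubling_time)
    print(f"Doubling every {doubling_time} years -> {factor:.0f}x after 10 years")
```

Under these assumptions, a 2-year doubling time yields 32 times the starting volume after 10 years, whereas a 5-year doubling time yields 4 times, which shows how sensitive the projected data burden is to the doubling estimate.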

In this report, we review common terminology, principles, and general applications of AI, followed by current and potential applications of AI for selected medical specialties. Finally, we discuss AI’s future in health care, along with potential risks and pitfalls.

 

AI Overview

AI refers to machine programs that can “learn” or think based on past experiences. This functionality contrasts with simple rules-based programming available to health care for years. An example of rules-based programming is the warfarindosing.org website developed by Barnes-Jewish Hospital at Washington University Medical Center, which guides initial warfarin dosing.6,7 The prescriber inputs detailed patient information, including age, sex, height, weight, tobacco history, medications, laboratory results, and genotype if available. The application then calculates recommended warfarin dosing regimens to avoid over- or underanticoagulation. While the dosing algorithm may be complex, it depends entirely on preprogrammed rules. The program does not learn to reach its conclusions and recommendations from patient data.

In contrast, one of the most common subsets of AI is machine learning (ML). ML describes a program that “learns from experience and improves its performance as it learns.”1 With ML, the computer is initially provided with a training data set—data with known outcomes or labels. Because the initial data are input from known samples, this type of AI is known as supervised learning.8-10 As an example, we recently reported using ML to diagnose various types of cancer from pathology slides.11 In one experiment, we captured images of colon adenocarcinoma and normal colon (these 2 groups represent the training data set). Unlike traditional programming, we did not define characteristics that would differentiate colon cancer from normal; rather, the machine learned these characteristics independently by assessing the labeled images provided. A second data set (the validation data set) was used to evaluate the program and fine-tune the ML training model’s parameters. Finally, the program was presented with new images of cancer and normal cases for final assessment of accuracy (test data set). Our program learned to recognize differences from the images provided and was able to differentiate normal and cancer images with > 95% accuracy.
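The roles of the training, validation, and test data sets can be sketched in a few lines of code. This is a minimal illustration, not the method used in the study described above: it assumes synthetic 2-feature samples in place of pathology images, a simple nearest-centroid classifier in place of a neural network, and a hypothetical tuning parameter named bias.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible


def make_samples(n, center, label):
    """Synthetic 2-feature samples scattered around a center (stand-in for image features)."""
    return [((random.gauss(center[0], 1.0), random.gauss(center[1], 1.0)), label)
            for _ in range(n)]


# Labeled data: 0 = "normal", 1 = "cancer" (synthetic values, not real pathology data)
data = make_samples(300, (0.0, 0.0), 0) + make_samples(300, (4.0, 4.0), 1)
random.shuffle(data)

# Split into training (70%), validation (15%), and test (15%) sets
train, val, test = data[:420], data[420:510], data[510:]


def centroid(samples, label):
    """Mean feature vector of all samples carrying the given label."""
    pts = [x for x, y in samples if y == label]
    return (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))


# Training: learn one centroid per class from the labeled training set
c0, c1 = centroid(train, 0), centroid(train, 1)


def sq_dist(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def predict(x, bias):
    """Predict 1 when x is closer to the class-1 centroid; `bias` shifts the boundary."""
    return 1 if sq_dist(x, c1) - sq_dist(x, c0) < bias else 0


def accuracy(samples, bias):
    return sum(predict(x, bias) == y for x, y in samples) / len(samples)


# Validation: pick the bias that performs best on held-out validation data
best_bias = max([-2.0, -1.0, 0.0, 1.0, 2.0], key=lambda b: accuracy(val, b))

# Test: report accuracy once, on data never used for training or tuning
test_accuracy = accuracy(test, best_bias)
print(f"chosen bias={best_bias}, test accuracy={test_accuracy:.2f}")
```

Keeping the test set untouched until the end is the design point: it gives an estimate of performance on new cases that is not inflated by the tuning step.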

Advances in computer processing have allowed for the development of artificial neural networks (ANNs). While there are several types of ANNs, the most common types used for image classification and segmentation are known as convolutional neural networks (CNNs).9,12-14 The programs are designed to work similarly to the human brain, specifically the visual cortex.15,16 As data are acquired, they are processed by various layers in the program. Much like neurons in the brain, one layer decides whether to advance information to the next.13,14 CNNs can be many layers deep, leading to the term deep learning: “computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction.”1,13,17
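To illustrate what a single convolutional layer does, the sketch below slides a small filter over a toy image and then applies a rectified linear unit (ReLU), which passes positive responses forward and suppresses the rest. The 4 x 4 image and the hand-written edge filter are assumptions for illustration; in an actual CNN, the filter weights are learned from training data and many such layers are stacked.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, the core operation of a convolutional layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


# Toy 4x4 "image" with a vertical edge: dark (0) on the left, bright (1) on the right
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-written vertical-edge filter (hypothetical; real CNN filters are learned)
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = relu(conv2d(image, kernel))
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The output is nonzero only in the middle column, where the filter straddles the dark-to-bright boundary, so only the information about the edge is advanced to the next layer.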

ANNs can process larger volumes of data. This advance has led to the development of unstructured or unsupervised learning. With this type of learning, inputting defined features (ie, predetermined answers) of the training data set described above is no longer required.1,8,10,14 The advantage of unsupervised learning is that the program can be presented raw data and extract meaningful interpretation without human input, often with less bias than may exist with supervised learning.1,18 If shown enough data, the program can extract relevant features to make conclusions independently without predefined definitions, potentially uncovering markers not previously known. For example, several studies have used unsupervised learning to search patient data to assess readmission risks of patients with congestive heart failure.10,19,20 AI independently compiled features that had not been previously defined and predicted which patients were at greater risk for readmission better than traditional methods did.
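As a small illustration of learning without labels, the sketch below groups unlabeled numbers into 2 clusters using k-means, a common unsupervised technique. It is not the method used in the cited readmission studies; the function name kmeans_1d and the measurement values are hypothetical.

```python
def kmeans_1d(values, k=2, iterations=10):
    """Minimal 1-D k-means (requires k >= 2): groups unlabeled values into k clusters."""
    ordered = sorted(values)
    # Deterministic initialization: spread the starting centers across the sorted data
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


# Hypothetical unlabeled measurements; no outcome or diagnosis is supplied
measurements = [1.0, 1.2, 0.9, 1.1, 5.0, 5.3, 4.8, 5.1]

centers, clusters = kmeans_1d(measurements, k=2)
print(centers)
print(clusters)
```

No labels are provided, yet the values separate into a low group and a high group; whether such groups are clinically meaningful still requires human interpretation.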

fdp03811527_f.png


A more detailed description of the various terminologies and techniques of AI is beyond the scope of this review.9,10,17,21 However, in this basic overview, we describe 4 general areas in which AI impacts health care (Figure).

 

 

Health Care Applications

Image analysis has seen the most AI health care applications.8,15 AI has shown potential in interpreting many types of medical images, including pathology slides, radiographs of various types, retina and other eye scans, and photographs of skin lesions. Many studies have demonstrated that AI can interpret these images as accurately as or even better than experienced clinicians.9,13,22-29 Studies have suggested AI interpretation of radiographs may better distinguish patients infected with COVID-19 from other causes of pneumonia, and AI interpretation of pathology slides may detect specific genetic mutations not previously identified without additional molecular tests.11,14,23,24,30-32

The second area in which AI can impact health care is improving workflow and efficiency. AI has improved surgery scheduling, saving significant revenue, and decreased patient wait times for appointments.1 AI can screen and triage radiographs, allowing attention to be directed to critical patients. This use would be valuable in many busy clinical settings, such as those seen during the recent COVID-19 pandemic.8,23 Similarly, AI can screen retina images to prioritize urgent conditions.25 AI has improved pathologists’ efficiency when used to detect breast metastases.33 Finally, AI may reduce medical errors, thereby ensuring patient safety.8,9,34

A third health care benefit of AI is in public health and epidemiology. AI can assist with clinical decision-making and diagnoses in low-income countries and areas with limited health care resources and personnel.25,29 AI can improve identification of infectious outbreaks, such as tuberculosis, malaria, dengue fever, and influenza.29,35-40 AI has been used to predict transmission patterns of the Zika virus and the current COVID-19 pandemic.41,42 Applications can stratify the risk of outbreaks based on multiple factors, including age, income, race, atypical geographic clusters, and seasonal factors like rainfall and temperature.35,36,38,43 AI has been used to assess morbidity and mortality, such as predicting disease severity with malaria and identifying treatment failures in tuberculosis.29

Finally, AI can dramatically impact health care by processing large data sets or disconnected volumes of patient information, so-called big data.44-46 An example is the widespread use of electronic health records (EHRs) such as the Computerized Patient Record System used in Veterans Affairs medical centers (VAMCs). Much patient information exists in written text: HCP notes, laboratory and radiology reports, medication records, etc. Natural language processing (NLP) allows platforms to sort through extensive volumes of data on complex patients at rates much faster than human capability, which has great potential to assist with diagnosis and treatment decisions.9

Medical literature is being produced at rates that exceed our ability to digest it. More than 200,000 cancer-related articles were published in 2019 alone.14 NLP capabilities of AI have the potential to rapidly sort through this extensive medical literature and relate it to specific verbiage in patient records to guide therapy.46 IBM Watson, a supercomputer based on ML and NLP, demonstrates this concept with many potential applications, only some of which relate to health care.1,9 Watson has an oncology component to assimilate multiple aspects of patient care, including clinical notes, pathology results, radiograph findings, staging, and a tumor’s genetic profile. It coordinates these inputs from the EHR and mines medical literature and research databases to recommend treatment options.1,46 AI can assess and compile far more patient data and therapeutic options than would be feasible for individual clinicians, thus providing customized patient care.47 Watson has partnered with numerous medical centers, including MD Anderson Cancer Center and Memorial Sloan Kettering Cancer Center, with variable success.44,47-49 While the full potential of Watson appears not yet to have been realized, these AI-driven approaches will likely play an important role in leveraging the hidden value in the expanding volume of health care information.

Medical Specialty Applications

Radiology

Currently, more than 70% of FDA-approved AI medical devices are in the field of radiology.2 Most radiology departments have used AI-friendly digital imaging for years, such as the picture archiving and communication systems used by numerous health care systems, including VAMCs.2,15 Gray-scale images common in radiology lend themselves to standardization, although AI is not limited to black-and-white image interpretation.15

An abundance of literature describes plain radiograph interpretation using AI. One FDA-approved platform improved X-ray diagnosis of wrist fractures when used by emergency medicine clinicians.2,50 AI has been applied to chest X-ray (CXR) interpretation of many conditions, including pneumonia, tuberculosis, malignant lung lesions, and COVID-19.23,25,28,44,51-53 For example, Nam and colleagues suggested AI is better at diagnosing malignant pulmonary nodules from CXRs than are trained radiologists.28

In addition to plain radiographs, AI has been applied to many other imaging technologies, including ultrasounds, positron emission tomography, mammograms, computed tomography (CT), and magnetic resonance imaging (MRI).15,26,44,48,54-56 A large study demonstrated that ML platforms significantly reduced the time to diagnose intracranial hemorrhages on CT and identified subtle hemorrhages missed by radiologists.55 Other studies have claimed that AI programs may be better than radiologists in detecting cancer in screening mammograms, and 3 FDA-approved devices focus on mammogram interpretation.2,15,54,57 There is also great interest in MRI applications to detect and predict prognosis for breast cancer based on imaging findings.21,56

Aside from providing accurate diagnoses, other studies focus on AI radiograph interpretation to assist with patient screening, triage, improving time to final diagnosis, providing a rapid “second opinion,” and even monitoring disease progression and offering insights into prognosis.8,21,23,52,55,56,58 These features help in busy urban centers but may play an even greater role in areas with limited access to health care or trained specialists such as radiologists.52

Cardiology

Cardiology has the second highest number of FDA-approved AI applications.2 Many cardiology AI platforms involve image analysis, as described in several recent reviews.45,59,60 AI has been applied to echocardiography to measure ejection fractions, detect valvular disease, and assess heart failure from hypertrophic and restrictive cardiomyopathy and amyloidosis.45,48,59 Applications for cardiac CT scans and CT angiography have successfully quantified both calcified and noncalcified coronary artery plaques, assessed lumens and myocardial perfusion, and performed coronary artery calcium scoring.45,59,60 Likewise, AI applications for cardiac MRI have been used to quantify ejection fraction, assess large vessel flow, and measure cardiac scar burden.45,59

For years ECG devices have provided interpretation with limited accuracy using preprogrammed parameters.48 However, the application of AI allows ECG interpretation on par with trained cardiologists. Numerous such AI applications exist, and 2 FDA-approved devices perform ECG interpretation.2,61-64 One of these devices incorporates an AI-powered stethoscope to detect atrial fibrillation and heart murmurs.65

Pathology

The advancement of whole slide imaging, wherein entire slides can be scanned and digitized at high speed and resolution, creates great potential for AI applications in pathology.12,24,32,33,66 A landmark study demonstrating the potential of AI for assessing whole slide imaging examined sentinel lymph node metastases in patients with breast cancer.22 Multiple algorithms in the study demonstrated that AI was equivalent to or better than pathologists in detecting metastases, especially when the pathologists were time-constrained consistent with a normal working environment. Significantly, the most accurate and efficient diagnoses were achieved when the pathologist and AI interpretations were used together.22,33

AI has shown promise in diagnosing many other entities, including cancers of the prostate (including Gleason scoring), lung, colon, breast, and skin.11,12,24,27,32,67 In addition, AI has shown great potential in scoring biomarkers important for prognosis and treatment, such as immunohistochemistry (IHC) labeling of Ki-67 and PD-L1.32 Pathologists can have difficulty classifying certain tumors or determining the site of origin for metastases, often having to rely on IHC with limited success. The unique features of image analysis with AI have the potential to assist in classifying difficult tumors and identifying sites of origin for metastatic disease based on morphology alone.11

Oncology depends heavily on molecular pathology testing to dictate treatment options and determine prognosis. Preliminary studies suggest that AI interpretation alone has the potential to delineate whether certain molecular mutations are present in tumors from various sites.11,14,24,32 One study combined histology and genomic results for AI interpretation that improved prognostic predictions.68 In addition, AI analysis may have potential in predicting tumor recurrence or prognosis based on cellular features, as demonstrated for lung cancer and melanoma.67,69,70

Ophthalmology

AI applications for ophthalmology have focused on diabetic retinopathy, age-related macular degeneration, glaucoma, retinopathy of prematurity, age-related and congenital cataracts, and retinal vein occlusion.71-73 Diabetic retinopathy is a leading cause of blindness and has been studied by numerous platforms with good success, most having used color fundus photography.71,72 One study showed AI could diagnose diabetic retinopathy and diabetic macular edema with specificities similar to ophthalmologists.74 In 2018, the FDA approved the AI platform IDx-DR. This diagnostic system classifies retinal images and recommends referral for patients determined to have “more than mild diabetic retinopathy” and reexamination within a year for other patients.8,75 Significantly, the platform recommendations do not require confirmation by a clinician.8

AI has been applied to other modalities in ophthalmology such as optical coherence tomography (OCT) to diagnose retinal disease and to predict appropriate management of congenital cataracts.25,73,76 For example, an AI application using OCT has been demonstrated to match or exceed the accuracy of retinal experts in diagnosing and triaging patients with a variety of retinal pathologies, including patients needing urgent referrals.77

Dermatology

Multiple studies demonstrate AI performs at least equal to experienced dermatologists in differentiating selected skin lesions.78-81 For example, Esteva and colleagues demonstrated AI could differentiate keratinocyte carcinomas from benign seborrheic keratoses and malignant melanomas from benign nevi with accuracy equal to 21 board-certified dermatologists.78

AI is applicable to various imaging procedures common to dermatology, such as dermoscopy, very high-frequency ultrasound, and reflectance confocal microscopy.82 Several studies have demonstrated that AI interpretation compared favorably to dermatologists evaluating dermoscopy to assess melanocytic lesions.78-81,83

A limitation in these studies is that they differentiate only a few diagnoses.82 Furthermore, dermatologists have sensory input such as touch and visual examination under various conditions, something AI has yet to replicate.15,34,84 Also, most AI devices use no or limited clinical information.81 Dermatologists can recognize rarer conditions for which AI models may have had limited or no training.34 Nevertheless, a recent study assessed AI for the diagnosis of 134 separate skin disorders with promising results, including providing diagnoses with accuracy comparable to that of dermatologists and providing accurate treatment strategies.84 As Topol points out, most skin lesions are diagnosed in the primary care setting where AI can have a greater impact when used in conjunction with the clinical impression, especially where specialists are in limited supply.48,78

Finally, dermatology lends itself to using portable or smartphone applications (apps) wherein the user can photograph a lesion for analysis by AI algorithms to assess the need for further evaluation or make treatment recommendations.34,84,85 Although results from currently available apps are not encouraging, they may play a greater role as the technology advances.34,85

Oncology

Applications of AI in oncology include predicting prognosis for patients with cancer based on histologic and/or genetic information.14,68,86 Programs can predict the risk of complications before and recurrence risks after surgery for malignancies.44,87-89 AI can also assist in treatment planning and predict treatment failure with radiation therapy.90,91

AI has great potential in processing the large volumes of patient data in cancer genomics. Next-generation sequencing has allowed for the identification of millions of DNA sequences in a single tumor to detect genetic anomalies.92 Thousands of mutations can be found in individual tumor samples, and processing this information and determining its significance can be beyond human capability.14 We know little about the effects of various mutation combinations, and most tumors have a heterogeneous molecular profile among different cell populations.14,93 The presence or absence of various mutations can have diagnostic, prognostic, and therapeutic implications.93 AI has great potential to sort through these complex data and identify actionable findings.

More than 200,000 cancer-related articles were published in 2019, and publications in the field of cancer genomics are increasing exponentially.14,92,93 Patel and colleagues assessed the utility of IBM Watson for Genomics against results from a molecular tumor board.93 Watson for Genomics identified potentially significant mutations not identified by the tumor board in 32% of patients. Most mutations were related to new clinical trials not yet added to the tumor board watch list, demonstrating the role AI will have in processing the large volume of genetic data required to deliver personalized medicine moving forward.

Gastroenterology

AI has shown promise in predicting risk or outcomes based on clinical parameters in various common gastroenterology problems, including gastric reflux, acute pancreatitis, gastrointestinal bleeding, celiac disease, and inflammatory bowel disease.94,95 AI endoscopic analysis has demonstrated potential in assessing Barrett’s esophagus, gastric Helicobacter pylori infections, gastric atrophy, and gastric intestinal metaplasia.95 Applications have been used to assess esophageal, gastric, and colonic malignancies, including depth of invasion based on endoscopic images.95 Finally, studies have evaluated AI to assess small colon polyps during colonoscopy, including differentiating benign and premalignant polyps with success comparable to gastroenterologists.94,95 AI has been shown to increase the speed and accuracy of gastroenterologists in detecting small polyps during colonoscopy.48 In a prospective randomized study, colonoscopies performed using an AI device identified significantly more small adenomatous polyps than colonoscopies without AI.96

Neurology

It has been suggested that AI technologies are well suited for application in neurology due to the subtle presentation of many neurologic diseases.16 Viz LVO, the first AI platform to receive CMS-approved reimbursement for the diagnosis of strokes, analyzes CTs to detect early ischemic strokes and alerts the medical team, thus shortening time to treatment.3,97 Many other AI platforms that use CT and MRI for the early detection of strokes, as well as for treatment and prognosis, are in use or development.9,97

AI technologies have been applied to neurodegenerative diseases, such as Alzheimer and Parkinson diseases.16,98 For example, several studies have evaluated patient movements in Parkinson disease both for early diagnosis and to assess response to treatment.98 These evaluations included assessment with external cameras as well as wearable devices and smartphone apps.

AI has also been applied to seizure disorders, attempting to determine seizure type, localize the area of seizure onset, and address the challenges of identifying seizures in neonates.99,100 Other potential applications range from early detection and prognosis predictions for cases of multiple sclerosis to restoring movement in paralysis from a variety of conditions such as spinal cord injury.9,101,102

Mental Health

Due to the interactive nature of mental health care, the field has been slower to develop AI applications.18 With heavy reliance on textual information (eg, clinic notes, mood rating scales, and documentation of conversations), successful AI applications in this field will likely rely heavily on NLP.18 However, studies investigating the application of AI to mental health have also incorporated data such as brain imaging, smartphone monitoring, and social media platforms, such as Facebook and Twitter.18,103,104

The risk of suicide is higher in veteran patients, and ML algorithms have had limited success in predicting suicide risk in both veteran and nonveteran populations.104-106 While early models have low positive predictive values and low sensitivities, they still promise to be a useful tool in conjunction with traditional risk assessments.106 Kessler and colleagues suggest that combining multiple rather than single ML algorithms might lead to greater success.105,106
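
The low positive predictive values noted above follow largely from the arithmetic of rare events. The short Python sketch below applies Bayes' rule to show how a model with seemingly reasonable sensitivity and specificity can still yield a low positive predictive value when the outcome is uncommon. The sensitivity, specificity, and prevalence values are hypothetical and chosen only for illustration; they are not taken from the cited studies.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Return P(outcome | positive flag) using Bayes' rule.

    All three arguments are proportions between 0 and 1.
    """
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Hypothetical inputs (assumptions for illustration, not study results):
# the model flags half of true cases, wrongly flags 5% of non-cases,
# and the outcome occurs in 1 of every 1000 patients.
ppv = positive_predictive_value(sensitivity=0.50, specificity=0.95, prevalence=0.001)
print(f"Positive predictive value: {ppv:.2%}")
```

Under these assumed inputs, only about 1 in 100 flagged patients would experience the outcome, which is one reason such models are discussed as adjuncts to traditional risk assessment rather than as stand-alone tools.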

AI may assist in diagnosing other mental health disorders, including major depressive disorder, attention deficit hyperactivity disorder (ADHD), schizophrenia, posttraumatic stress disorder, and Alzheimer disease.103,104,107 These investigations are in the early stages with limited clinical applicability. However, 2 AI applications awaiting FDA approval relate to ADHD and opioid use.2 Furthermore, potential exists for AI to not only assist with prevention and diagnosis of ADHD, but also to identify optimal treatment options.2,103

General and Personalized Medicine

Additional AI applications include diagnosing patients with suspected sepsis, measuring liver iron concentrations, predicting hospital mortality at the time of admission, and more.2,108,109 AI can guide end-of-life decisions such as resuscitation status or whether to initiate mechanical ventilation.48

AI-driven smartphone apps can be beneficial to both patients and clinicians. Examples include predicting nonadherence to anticoagulation therapy, monitoring heart rhythms for atrial fibrillation or signs of hyperkalemia in patients with renal failure, and improving outcomes for patients with diabetes mellitus by decreasing glycemic variability and reducing hypoglycemia.8,48,110,111 The potential for AI applications in health care and personalized medicine is almost limitless.

Discussion

With ever-increasing expectations for all health care sectors to deliver timely, fiscally responsible, high-quality health care, AI has the potential to have numerous impacts. AI can improve diagnostic accuracy while limiting errors and can improve patient safety, such as by assisting with prescription delivery.8,9,34 It can screen and triage patients, alerting clinicians to those needing more urgent evaluation.8,23,77,97 AI also may increase a clinician’s efficiency and speed in rendering a diagnosis.12,13,55,97 AI can provide a rapid second opinion, an ability especially beneficial in underserved areas with shortages of specialists.23,25,26,29,34 Similarly, AI may decrease the inter- and intraobserver variability common in many medical specialties.12,27,45 AI applications can also monitor disease progression, identify patients at greatest risk, and provide information for prognosis.21,23,56,58 Finally, as described with applications using IBM Watson, AI can allow for an integrated approach to health care that is currently lacking.

We have described many reports suggesting AI can render diagnoses as well as or better than experienced clinicians, and speculation exists that AI will replace many roles currently performed by health care practitioners.9,26 However, most studies demonstrate that AI’s diagnostic benefits are best realized when used to supplement a clinician’s impression.8,22,30,33,52,54,56,69,84 AI is not likely to replace humans in health care in the foreseeable future. The impact of the technology can be likened to that of CT scans, developed in the 1970s, on neurology. Prior to such detailed imaging, neurologists spent extensive time performing detailed physicals to render diagnoses and locate lesions before surgery. There was mistrust of this new technology and concern that CT scans would eliminate the need for neurologists.112 On the contrary, neurology is alive and well, frequently being augmented by the technologies once speculated to replace it.

Commercial AI health care platforms represented a $2 billion industry in 2018 and are growing rapidly each year.13,32 Many AI products are offered ready for implementation for various tasks, including diagnostics, patient management, and improved efficiency. Others will likely be provided as templates suitable for modification to meet the specific needs of the facility, practice, or specialty for its patient population.

AI Risks and Limitations

AI has several risks and limitations. Although there is progress in explainable AI, at times we still struggle to understand how the output provided by machine learning algorithms was created.44,48 The many layers associated with deep learning self-determine the criteria used to reach a conclusion, and these criteria can continually evolve. The parameters of deep learning are not preprogrammed, and there are too many individual data points to be extrapolated or deconvoluted for evaluation at our current level of knowledge.26,51 This apparent lack of constraints causes concern for patient safety and suggests that greater validation and continued scrutiny of validity are required.8,48 Efforts are underway to create explainable AI programs to make their processes more transparent, but such clarification is limited presently.14,26,48,77

Another challenge of AI is determining the amount of training data required to function optimally. Also, if the output describes multiple variables or diagnoses, is each equally valid?113 Furthermore, many AI applications look for a specific process, such as cancer diagnoses on CXRs. However, the effect on the diagnosis of coexisting conditions seen on CXRs, such as cardiomegaly, emphysema, and pneumonia, also needs to be considered.51,52 Zech and colleagues provide the example that diagnoses for pneumothorax are frequently rendered on CXRs with chest tubes in place.51 They suggest that CNNs may develop a bias toward diagnosing pneumothorax when chest tubes are present. Many current studies approach an issue in isolation, a situation not realistic in real-world clinical practice.26

Most studies on AI have been retrospective, and frequently data used to train the program are preselected.13,26 The programs are typically validated on available databases rather than on actual patients in the clinical setting, limiting confidence in the validity of the AI output when applied to real-world situations. Currently, fewer than 12 prospective trials have been published comparing AI with traditional clinical care.13,114 Randomized prospective clinical trials are even fewer, with none currently reported from the United States.13,114 The results from several studies have been shown to diminish when repeated prospectively.114

The FDA has created a new category known as Software as a Medical Device and has a Digital Health Innovation Action Plan to regulate AI platforms. Still, the process of AI regulation is of necessity different from traditional approval processes and is continually evolving.8 The FDA approval process cannot account for the fact that the program’s parameters may continually evolve or adapt.2

Guidelines for investigating and reporting AI research with its unique attributes are being developed. Examples include the TRIPOD-ML statement and others.49,115 In September 2020, 2 publications addressed the paucity of gold-standard randomized clinical trials in clinical AI applications.116,117 The SPIRIT-AI statement expands on the original SPIRIT statement published in 2013 to guide minimal reporting standards for AI clinical trial protocols to promote transparency of design and methodology.116 Similarly, the CONSORT-AI extension, stemming from the original CONSORT statement in 1996, aims to ensure quality reporting of randomized controlled trials in AI.117

Another risk with AI is that while an individual physician making a mistake may adversely affect 1 patient, a single mistake in an AI algorithm could potentially affect thousands of patients.48 Also, AI programs developed for the patient population at one facility may not translate to another. Referred to as overfitting, this phenomenon relates to selection bias in training data sets.15,34,49,51,52 Studies have shown that programs that underrepresent certain group characteristics such as age, sex, or race may be less effective when applied to a population in which these characteristics have differing representations.8,48,49 This problem of underrepresentation has been demonstrated in programs interpreting pathology slides, radiographs, and skin lesions.15,32,51
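
One practical way to look for the underrepresentation problem described above is to report a performance metric separately for each patient subgroup rather than only for the pooled test set. The Python sketch below computes sensitivity by subgroup. The function name, group labels, and toy records are hypothetical and serve only to illustrate the idea; they do not come from any of the cited studies.

```python
from collections import defaultdict


def sensitivity_by_group(records):
    """Compute sensitivity (true-positive rate) for each subgroup.

    records: iterable of (group, true_label, predicted_label) tuples,
    where labels are 1 for disease present and 0 for disease absent.
    Groups with no disease-positive cases are omitted from the result.
    """
    positives = defaultdict(int)       # disease-positive cases per group
    true_positives = defaultdict(int)  # correctly flagged cases per group
    for group, true_label, predicted_label in records:
        if true_label == 1:
            positives[group] += 1
            if predicted_label == 1:
                true_positives[group] += 1
    return {group: true_positives[group] / positives[group] for group in positives}


# Hypothetical toy data: (subgroup, true label, model prediction).
records = [
    ("group_a", 1, 1), ("group_a", 1, 1), ("group_a", 1, 1), ("group_a", 1, 0),
    ("group_a", 0, 0),
    ("group_b", 1, 1), ("group_b", 1, 0), ("group_b", 1, 0), ("group_b", 1, 0),
    ("group_b", 0, 0),
]
per_group = sensitivity_by_group(records)
for group, value in sorted(per_group.items()):
    print(f"{group}: sensitivity = {value:.2f}")
```

In this toy data the pooled sensitivity is 0.50, which conceals a gap between 0.75 in one subgroup and 0.25 in the other. A gap of this kind in real validation data would be a signal to examine how each subgroup was represented in the training set.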

Admittedly, most of these challenges are not specific to AI and existed in health care previously. Physicians make mistakes, treatments are sometimes used without adequate prospective studies, and medications are given without understanding their mechanism of action, much as AI-facilitated processes reach conclusions that cannot be fully explained.48

Conclusions

The view that AI will dramatically impact health care in the coming years will likely prove true. However, much work is needed, especially because of the paucity of the prospective clinical trials that have historically been required in medical research. Any concern that AI will replace HCPs seems unwarranted. Early studies suggest that even AI programs that appear to exceed human interpretation perform best when working in cooperation with, and under oversight from, clinicians. AI’s greatest potential appears to be its ability to augment care from health professionals, improving efficiency and accuracy, and it should be anticipated with enthusiasm as the field moves forward at an exponential rate.

Acknowledgments

The authors thank Makenna G. Thomas for proofreading and review of the manuscript. This material is the result of work supported with resources and the use of facilities at the James A. Haley Veterans’ Hospital. This research has been approved by the James A. Haley Veterans’ Hospital Office of Communications and Media.

References

1. Bini SA. Artificial intelligence, machine learning, deep learning, and cognitive computing: what do these terms mean and how will they impact health care? J Arthroplasty. 2018;33(8):2358-2361. doi:10.1016/j.arth.2018.02.067

2. Benjamens S, Dhunnoo P, Meskó B. The state of artificial intelligence-based FDA-approved medical devices and algorithms: an online database. NPJ Digit Med. 2020;3:118. doi:10.1038/s41746-020-00324-0

3. Viz. AI powered synchronized stroke care. Accessed September 15, 2021. https://www.viz.ai/ischemic-stroke

4. Buchanan M. The law of accelerating returns. Nat Phys. 2008;4(7):507. doi:10.1038/nphys1010

5. IBM Watson Health computes a pair of new solutions to improve healthcare data and security. Published September 10, 2015. Accessed October 21, 2020. https://www.techrepublic.com/article/ibm-watson-health-computes-a-pair-of-new-solutions-to-improve-healthcare-data-and-security

6. Borkowski AA, Kardani A, Mastorides SM, Thomas LB. Warfarin pharmacogenomics: recommendations with available patented clinical technologies. Recent Pat Biotechnol. 2014;8(2):110-115. doi:10.2174/1872208309666140904112003

7. Washington University in St. Louis. Warfarin dosing. Accessed September 15, 2021. http://www.warfarindosing.org/Source/Home.aspx

8. He J, Baxter SL, Xu J, Xu J, Zhou X, Zhang K. The practical implementation of artificial intelligence technologies in medicine. Nat Med. 2019;25(1):30-36. doi:10.1038/s41591-018-0307-0

9. Jiang F, Jiang Y, Zhi H, et al. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. 2017;2(4):230-243. Published 2017 Jun 21. doi:10.1136/svn-2017-000101

10. Johnson KW, Torres Soto J, Glicksberg BS, et al. Artificial intelligence in cardiology. J Am Coll Cardiol. 2018;71(23):2668-2679. doi:10.1016/j.jacc.2018.03.521

11. Borkowski AA, Wilson CP, Borkowski SA, et al. Comparing artificial intelligence platforms for histopathologic cancer diagnosis. Fed Pract. 2019;36(10):456-463.

12. Cruz-Roa A, Gilmore H, Basavanhally A, et al. High-throughput adaptive sampling for whole-slide histopathology image analysis (HASHI) via convolutional neural networks: application to invasive breast cancer detection. PLoS One. 2018;13(5):e0196828. Published 2018 May 24. doi:10.1371/journal.pone.0196828

13. Nagendran M, Chen Y, Lovejoy CA, et al. Artificial intelligence versus clinicians: systematic review of design, reporting standards, and claims of deep learning studies. BMJ. 2020;368:m689. Published 2020 Mar 25. doi:10.1136/bmj.m689

14. Shimizu H, Nakayama KI. Artificial intelligence in oncology. Cancer Sci. 2020;111(5):1452-1460. doi:10.1111/cas.14377

15. Talebi-Liasi F, Markowitz O. Is artificial intelligence going to replace dermatologists? Cutis. 2020;105(1):28-31.

16. Valliani AA, Ranti D, Oermann EK. Deep learning and neurology: a systematic review. Neurol Ther. 2019;8(2):351-365. doi:10.1007/s40120-019-00153-8

17. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444. doi:10.1038/nature14539

18. Graham S, Depp C, Lee EE, et al. Artificial intelligence for mental health and mental illnesses: an overview. Curr Psychiatry Rep. 2019;21(11):116. Published 2019 Nov 7. doi:10.1007/s11920-019-1094-0

19. Golas SB, Shibahara T, Agboola S, et al. A machine learning model to predict the risk of 30-day readmissions in patients with heart failure: a retrospective analysis of electronic medical records data. BMC Med Inform Decis Mak. 2018;18(1):44. Published 2018 Jun 22. doi:10.1186/s12911-018-0620-z

20. Mortazavi BJ, Downing NS, Bucholz EM, et al. Analysis of machine learning techniques for heart failure readmissions. Circ Cardiovasc Qual Outcomes. 2016;9(6):629-640. doi:10.1161/CIRCOUTCOMES.116.003039

21. Meyer-Bäse A, Morra L, Meyer-Bäse U, Pinker K. Current status and future perspectives of artificial intelligence in magnetic resonance breast imaging. Contrast Media Mol Imaging. 2020;2020:6805710. Published 2020 Aug 28. doi:10.1155/2020/6805710

22. Ehteshami Bejnordi B, Veta M, Johannes van Diest P, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA. 2017;318(22):2199-2210. doi:10.1001/jama.2017.14585

23. Borkowski AA, Viswanadhan NA, Thomas LB, Guzman RD, Deland LA, Mastorides SM. Using artificial intelligence for COVID-19 chest X-ray diagnosis. Fed Pract. 2020;37(9):398-404. doi:10.12788/fp.0045

24. Coudray N, Ocampo PS, Sakellaropoulos T, et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat Med. 2018;24(10):1559-1567. doi:10.1038/s41591-018-0177-5

25. Kermany DS, Goldbaum M, Cai W, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell. 2018;172(5):1122-1131.e9. doi:10.1016/j.cell.2018.02.010

26. Liu X, Faes L, Kale AU, et al. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit Health. 2019;1(6):e271-e297. doi:10.1016/S2589-7500(19)30123-2

27. Nagpal K, Foote D, Liu Y, et al. Development and validation of a deep learning algorithm for improving Gleason scoring of prostate cancer [published correction appears in NPJ Digit Med. 2019 Nov 19;2:113]. NPJ Digit Med. 2019;2:48. Published 2019 Jun 7. doi:10.1038/s41746-019-0112-2

28. Nam JG, Park S, Hwang EJ, et al. Development and validation of deep learning-based automatic detection algorithm for malignant pulmonary nodules on chest radiographs. Radiology. 2019;290(1):218-228. doi:10.1148/radiol.2018180237

29. Schwalbe N, Wahl B. Artificial intelligence and the future of global health. Lancet. 2020;395(10236):1579-1586. doi:10.1016/S0140-6736(20)30226-9

30. Bai HX, Wang R, Xiong Z, et al. Artificial intelligence augmentation of radiologist performance in distinguishing COVID-19 from pneumonia of other origin at chest CT [published correction appears in Radiology. 2021 Apr;299(1):E225]. Radiology. 2020;296(3):E156-E165. doi:10.1148/radiol.2020201491

31. Li L, Qin L, Xu Z, et al. Using artificial intelligence to detect COVID-19 and community-acquired pneumonia based on pulmonary CT: evaluation of the diagnostic accuracy. Radiology. 2020;296(2):E65-E71. doi:10.1148/radiol.2020200905

32. Serag A, Ion-Margineanu A, Qureshi H, et al. Translational AI and deep learning in diagnostic pathology. Front Med (Lausanne). 2019;6:185. Published 2019 Oct 1. doi:10.3389/fmed.2019.00185

33. Wang D, Khosla A, Gargeya R, Irshad H, Beck AH. Deep learning for identifying metastatic breast cancer. ArXiv. 2016 June 18:arXiv:1606.05718v1. Published online June 18, 2016. Accessed September 15, 2021. http://arxiv.org/abs/1606.05718

34. Alabdulkareem A. Artificial intelligence and dermatologists: friends or foes? J Dermatology Dermatol Surg. 2019;23(2):57-60. doi:10.4103/jdds.jdds_19_19

35. Mollalo A, Mao L, Rashidi P, Glass GE. A GIS-based artificial neural network model for spatial distribution of tuberculosis across the continental United States. Int J Environ Res Public Health. 2019;16(1):157. Published 2019 Jan 8. doi:10.3390/ijerph16010157

36. Haddawy P, Hasan AHMI, Kasantikul R, et al. Spatiotemporal Bayesian networks for malaria prediction. Artif Intell Med. 2018;84:127-138. doi:10.1016/j.artmed.2017.12.002

37. Laureano-Rosario AE, Duncan AP, Mendez-Lazaro PA, et al. Application of artificial neural networks for dengue fever outbreak predictions in the northwest coast of Yucatan, Mexico and San Juan, Puerto Rico. Trop Med Infect Dis. 2018;3(1):5. Published 2018 Jan 5. doi:10.3390/tropicalmed3010005

38. Buczak AL, Koshute PT, Babin SM, Feighner BH, Lewis SH. A data-driven epidemiological prediction method for dengue outbreaks using local and remote sensing data. BMC Med Inform Decis Mak. 2012;12:124. Published 2012 Nov 5. doi:10.1186/1472-6947-12-124

39. Scavuzzo JM, Trucco F, Espinosa M, et al. Modeling dengue vector population using remotely sensed data and machine learning. Acta Trop. 2018;185:167-175. doi:10.1016/j.actatropica.2018.05.003

40. Xue H, Bai Y, Hu H, Liang H. Influenza activity surveillance based on multiple regression model and artificial neural network. IEEE Access. 2018;6:563-575. doi:10.1109/ACCESS.2017.2771798

41. Jiang D, Hao M, Ding F, Fu J, Li M. Mapping the transmission risk of Zika virus using machine learning models. Acta Trop. 2018;185:391-399. doi:10.1016/j.actatropica.2018.06.021

42. Bragazzi NL, Dai H, Damiani G, Behzadifar M, Martini M, Wu J. How big data and artificial intelligence can help better manage the COVID-19 pandemic. Int J Environ Res Public Health. 2020;17(9):3176. Published 2020 May 2. doi:10.3390/ijerph17093176

43. Lake IR, Colón-González FJ, Barker GC, Morbey RA, Smith GE, Elliot AJ. Machine learning to refine decision making within a syndromic surveillance service. BMC Public Health. 2019;19(1):559. Published 2019 May 14. doi:10.1186/s12889-019-6916-9

44. Khan OF, Bebb G, Alimohamed NA. Artificial intelligence in medicine: what oncologists need to know about its potential-and its limitations. Oncol Exch. 2017;16(4):8-13. Accessed September 1, 2021. http://www.oncologyex.com/pdf/vol16_no4/feature_khan-ai.pdf

45. Badano LP, Keller DM, Muraru D, Torlasco C, Parati G. Artificial intelligence and cardiovascular imaging: A win-win combination. Anatol J Cardiol. 2020;24(4):214-223. doi:10.14744/AnatolJCardiol.2020.94491

46. Murdoch TB, Detsky AS. The inevitable application of big data to health care. JAMA. 2013;309(13):1351-1352. doi:10.1001/jama.2013.393

47. Greatbatch O, Garrett A, Snape K. The impact of artificial intelligence on the current and future practice of clinical cancer genomics. Genet Res (Camb). 2019;101:e9. Published 2019 Oct 31. doi:10.1017/S0016672319000089

48. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25(1):44-56. doi:10.1038/s41591-018-0300-7

49. Vollmer S, Mateen BA, Bohner G, et al. Machine learning and artificial intelligence research for patient benefit: 20 critical questions on transparency, replicability, ethics, and effectiveness [published correction appears in BMJ. 2020 Apr 1;369:m1312]. BMJ. 2020;368:l6927. Published 2020 Mar 20. doi:10.1136/bmj.l6927

50. Lindsey R, Daluiski A, Chopra S, et al. Deep neural network improves fracture detection by clinicians. Proc Natl Acad Sci U S A. 2018;115(45):11591-11596. doi:10.1073/pnas.1806905115

51. Zech JR, Badgeley MA, Liu M, Costa AB, Titano JJ, Oermann EK. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: a cross-sectional study. PLoS Med. 2018;15(11):e1002683. doi:10.1371/journal.pmed.1002683

52. Lakhani P, Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology. 2017;284(2):574-582. doi:10.1148/radiol.2017162326

53. Rajpurkar P, Joshi A, Pareek A, et al. CheXpedition: investigating generalization challenges for translation of chest x-ray algorithms to the clinical setting. ArXiv. 2020 Feb 26:arXiv:2002.11379v2. Revised March 11, 2020. Accessed September 15, 2021. http://arxiv.org/abs/2002.11379

54. Salim M, Wåhlin E, Dembrower K, et al. External evaluation of 3 commercial artificial intelligence algorithms for independent assessment of screening mammograms. JAMA Oncol. 2020;6(10):1581-1588. doi:10.1001/jamaoncol.2020.3321

55. Arbabshirani MR, Fornwalt BK, Mongelluzzo GJ, et al. Advanced machine learning in action: identification of intracranial hemorrhage on computed tomography scans of the head with clinical workflow integration. NPJ Digit Med. 2018;1:9. doi:10.1038/s41746-017-0015-z

56. Sheth D, Giger ML. Artificial intelligence in the interpretation of breast cancer on MRI. J Magn Reson Imaging. 2020;51(5):1310-1324. doi:10.1002/jmri.26878

57. McKinney SM, Sieniek M, Godbole V, et al. International evaluation of an AI system for breast cancer screening. Nature. 2020;577(7788):89-94. doi:10.1038/s41586-019-1799-6

58. Booth AL, Abels E, McCaffrey P. Development of a prognostic model for mortality in COVID-19 infection using machine learning. Mod Pathol. 2021;34(3):522-531. doi:10.1038/s41379-020-00700-x

59. Xu B, Kocyigit D, Grimm R, Griffin BP, Cheng F. Applications of artificial intelligence in multimodality cardiovascular imaging: a state-of-the-art review. Prog Cardiovasc Dis. 2020;63(3):367-376. doi:10.1016/j.pcad.2020.03.003

60. Dey D, Slomka PJ, Leeson P, et al. Artificial intelligence in cardiovascular imaging: JACC state-of-the-art review. J Am Coll Cardiol. 2019;73(11):1317-1335. doi:10.1016/j.jacc.2018.12.054

61. Carewell Health. AI powered ECG diagnosis solutions. Accessed November 2, 2020. https://www.carewellhealth.com/products_aiecg.html

62. Strodthoff N, Strodthoff C. Detecting and interpreting myocardial infarction using fully convolutional neural networks. Physiol Meas. 2019;40(1):015001. doi:10.1088/1361-6579/aaf34d

63. Hannun AY, Rajpurkar P, Haghpanahi M, et al. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat Med. 2019;25(1):65-69. doi:10.1038/s41591-018-0268-3

64. Kwon JM, Jeon KH, Kim HM, et al. Comparing the performance of artificial intelligence and conventional diagnosis criteria for detecting left ventricular hypertrophy using electrocardiography. Europace. 2020;22(3):412-419. doi:10.1093/europace/euz324

65. Eko. FDA clears Eko’s AFib and heart murmur detection algorithms, making it the first AI-powered stethoscope to screen for serious heart conditions [press release]. Published January 28, 2020. Accessed September 15, 2021. https://www.businesswire.com/news/home/20200128005232/en/FDA-Clears-Eko’s-AFib-and-Heart-Murmur-Detection-Algorithms-Making-It-the-First-AI-Powered-Stethoscope-to-Screen-for-Serious-Heart-Conditions

66. Cruz-Roa A, Gilmore H, Basavanhally A, et al. Accurate and reproducible invasive breast cancer detection in whole-slide images: a deep learning approach for quantifying tumor extent. Sci Rep. 2017;7:46450. doi:10.1038/srep46450

67. Acs B, Rantalainen M, Hartman J. Artificial intelligence as the next step towards precision pathology. J Intern Med. 2020;288(1):62-81. doi:10.1111/joim.13030

68. Mobadersany P, Yousefi S, Amgad M, et al. Predicting cancer outcomes from histology and genomics using convolutional networks. Proc Natl Acad Sci U S A. 2018;115(13):E2970-E2979. doi:10.1073/pnas.1717139115

69. Wang X, Janowczyk A, Zhou Y, et al. Prediction of recurrence in early stage non-small cell lung cancer using computer extracted nuclear features from digital H&E images. Sci Rep. 2017;7:13543. doi:10.1038/s41598-017-13773-7

70. Kulkarni PM, Robinson EJ, Pradhan JS, et al. Deep learning based on standard H&E images of primary melanoma tumors identifies patients at risk for visceral recurrence and death. Clin Cancer Res. 2020;26(5):1126-1134. doi:10.1158/1078-0432.CCR-19-1495

71. Du XL, Li WB, Hu BJ. Application of artificial intelligence in ophthalmology. Int J Ophthalmol. 2018;11(9):1555-1561. doi:10.18240/ijo.2018.09.21

72. Gunasekeran DV, Wong TY. Artificial intelligence in ophthalmology in 2020: a technology on the cusp for translation and implementation. Asia Pac J Ophthalmol (Phila). 2020;9(2):61-66. doi:10.1097/01.APO.0000656984.56467.2c

73. Ting DSW, Pasquale LR, Peng L, et al. Artificial intelligence and deep learning in ophthalmology. Br J Ophthalmol. 2019;103(2):167-175. doi:10.1136/bjophthalmol-2018-313173

74. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. 2016;316(22):2402-2410. doi:10.1001/jama.2016.17216

75. US Food and Drug Administration. FDA permits marketing of artificial intelligence-based device to detect certain diabetes-related eye problems [press release]. Published April 11, 2018. Accessed September 15, 2021. https://www.fda.gov/news-events/press-announcements/fda-permits-marketing-artificial-intelligence-based-device-detect-certain-diabetes-related-eye

76. Long E, Chen J, Wu X, et al. Artificial intelligence manages congenital cataract with individualized prediction and telehealth computing. NPJ Digit Med. 2020;3:112. doi:10.1038/s41746-020-00319-x

77. De Fauw J, Ledsam JR, Romera-Paredes B, et al. Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat Med. 2018;24(9):1342-1350. doi:10.1038/s41591-018-0107-6

78. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115-118. doi:10.1038/nature21056

79. Brinker TJ, Hekler A, Enk AH, et al. Deep neural networks are superior to dermatologists in melanoma image classification. Eur J Cancer. 2019;119:11-17. doi:10.1016/j.ejca.2019.05.023

80. Brinker TJ, Hekler A, Enk AH, et al. A convolutional neural network trained with dermoscopic images performed on par with 145 dermatologists in a clinical melanoma image classification task. Eur J Cancer. 2019;111:148-154. doi:10.1016/j.ejca.2019.02.005

81. Haenssle HA, Fink C, Schneiderbauer R, et al. Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Ann Oncol. 2018;29(8):1836-1842. doi:10.1093/annonc/mdy166

82. Li CX, Shen CB, Xue K, et al. Artificial intelligence in dermatology: past, present, and future. Chin Med J (Engl). 2019;132(17):2017-2020. doi:10.1097/CM9.0000000000000372

83. Tschandl P, Codella N, Akay BN, et al. Comparison of the accuracy of human readers versus machine-learning algorithms for pigmented skin lesion classification: an open, web-based, international, diagnostic study. Lancet Oncol. 2019;20(7):938-947. doi:10.1016/S1470-2045(19)30333-X

84. Han SS, Park I, Eun Chang SE, et al. Augmented intelligence dermatology: deep neural networks empower medical professionals in diagnosing skin cancer and predicting treatment options for 134 skin disorders. J Invest Dermatol. 2020;140(9):1753-1761. doi:10.1016/j.jid.2020.01.019

85. Freeman K, Dinnes J, Chuchu N, et al. Algorithm based smartphone apps to assess risk of skin cancer in adults: systematic review of diagnostic accuracy studies [published correction appears in BMJ. 2020 Feb 25;368:m645]. BMJ. 2020;368:m127. Published 2020 Feb 10. doi:10.1136/bmj.m127

86. Chen YC, Ke WC, Chiu HW. Risk classification of cancer survival using ANN with gene expression data from multiple laboratories. Comput Biol Med. 2014;48:1-7. doi:10.1016/j.compbiomed.2014.02.006

87. Kim W, Kim KS, Lee JE, et al. Development of novel breast cancer recurrence prediction model using support vector machine. J Breast Cancer. 2012;15(2):230-238. doi:10.4048/jbc.2012.15.2.230

88. Merath K, Hyer JM, Mehta R, et al. Use of machine learning for prediction of patient risk of postoperative complications after liver, pancreatic, and colorectal surgery. J Gastrointest Surg. 2020;24(8):1843-1851. doi:10.1007/s11605-019-04338-2

89. Santos-García G, Varela G, Novoa N, Jiménez MF. Prediction of postoperative morbidity after lung resection using an artificial neural network ensemble. Artif Intell Med. 2004;30(1):61-69. doi:10.1016/S0933-3657(03)00059-9

90. Ibragimov B, Xing L. Segmentation of organs-at-risks in head and neck CT images using convolutional neural networks. Med Phys. 2017;44(2):547-557. doi:10.1002/mp.12045

91. Lou B, Doken S, Zhuang T, et al. An image-based deep learning framework for individualizing radiotherapy dose. Lancet Digit Health. 2019;1(3):e136-e147. doi:10.1016/S2589-7500(19)30058-5

92. Xu J, Yang P, Xue S, et al. Translating cancer genomics into precision medicine with artificial intelligence: applications, challenges and future perspectives. Hum Genet. 2019;138(2):109-124. doi:10.1007/s00439-019-01970-5

93. Patel NM, Michelini VV, Snell JM, et al. Enhancing next‐generation sequencing‐guided cancer care through cognitive computing. Oncologist. 2018;23(2):179-185. doi:10.1634/theoncologist.2017-0170

94. Le Berre C, Sandborn WJ, Aridhi S, et al. Application of artificial intelligence to gastroenterology and hepatology. Gastroenterology. 2020;158(1):76-94.e2. doi:10.1053/j.gastro.2019.08.058

95. Yang YJ, Bang CS. Application of artificial intelligence in gastroenterology. World J Gastroenterol. 2019;25(14):1666-1683. doi:10.3748/wjg.v25.i14.1666

96. Wang P, Berzin TM, Glissen Brown JR, et al. Real-time automatic detection system increases colonoscopic polyp and adenoma detection rates: a prospective randomised controlled study. Gut. 2019;68(10):1813-1819. doi:10.1136/gutjnl-2018-317500

97. Gupta R, Krishnam SP, Schaefer PW, Lev MH, Gonzalez RG. An East Coast perspective on artificial intelligence and machine learning: part 2: ischemic stroke imaging and triage. Neuroimaging Clin N Am. 2020;30(4):467-478. doi:10.1016/j.nic.2020.08.002

98. Belić M, Bobić V, Badža M, Šolaja N, Đurić-Jovičić M, Kostić VS. Artificial intelligence for assisting diagnostics and assessment of Parkinson’s disease: a review. Clin Neurol Neurosurg. 2019;184:105442. doi:10.1016/j.clineuro.2019.105442

99. An S, Kang C, Lee HW. Artificial intelligence and computational approaches for epilepsy. J Epilepsy Res. 2020;10(1):8-17. doi:10.14581/jer.20003

100. Pavel AM, Rennie JM, de Vries LS, et al. A machine-learning algorithm for neonatal seizure recognition: a multicentre, randomised, controlled trial. Lancet Child Adolesc Health. 2020;4(10):740-749. doi:10.1016/S2352-4642(20)30239-X

101. Afzal HMR, Luo S, Ramadan S, Lechner-Scott J. The emerging role of artificial intelligence in multiple sclerosis imaging [published online ahead of print, 2020 Oct 28]. Mult Scler. 2020;1352458520966298. doi:10.1177/1352458520966298

102. Bouton CE. Restoring movement in paralysis with a bioelectronic neural bypass approach: current state and future directions. Cold Spring Harb Perspect Med. 2019;9(11):a034306. doi:10.1101/cshperspect.a034306

103. Durstewitz D, Koppe G, Meyer-Lindenberg A. Deep neural networks in psychiatry. Mol Psychiatry. 2019;24(11):1583-1598. doi:10.1038/s41380-019-0365-9

104. Fonseka TM, Bhat V, Kennedy SH. The utility of artificial intelligence in suicide risk prediction and the management of suicidal behaviors. Aust N Z J Psychiatry. 2019;53(10):954-964. doi:10.1177/0004867419864428

105. Kessler RC, Hwang I, Hoffmire CA, et al. Developing a practical suicide risk prediction model for targeting high-risk patients in the Veterans Health Administration. Int J Methods Psychiatr Res. 2017;26(3):e1575. doi:10.1002/mpr.1575

106. Kessler RC, Bauer MS, Bishop TM, et al. Using administrative data to predict suicide after psychiatric hospitalization in the Veterans Health Administration System. Front Psychiatry. 2020;11:390. doi:10.3389/fpsyt.2020.00390

107. Kessler RC, van Loo HM, Wardenaar KJ, et al. Testing a machine-learning algorithm to predict the persistence and severity of major depressive disorder from baseline self-reports. Mol Psychiatry. 2016;21(10):1366-1371. doi:10.1038/mp.2015.198

108. Horng S, Sontag DA, Halpern Y, Jernite Y, Shapiro NI, Nathanson LA. Creating an automated trigger for sepsis clinical decision support at emergency department triage using machine learning. PLoS One. 2017;12(4):e0174708. doi:10.1371/journal.pone.0174708

109. Soffer S, Klang E, Barash Y, Grossman E, Zimlichman E. Predicting in-hospital mortality at admission to the medical ward: a big-data machine learning model. Am J Med. 2021;134(2):227-234.e4. doi:10.1016/j.amjmed.2020.07.014

110. Labovitz DL, Shafner L, Reyes Gil M, Virmani D, Hanina A. Using artificial intelligence to reduce the risk of nonadherence in patients on anticoagulation therapy. Stroke. 2017;48(5):1416-1419. doi:10.1161/STROKEAHA.116.016281

111. Forlenza GP. Use of artificial intelligence to improve diabetes outcomes in patients using multiple daily injections therapy. Diabetes Technol Ther. 2019;21(S2):S24-S28. doi:10.1089/dia.2019.0077

112. Poser CM. CT scan and the practice of neurology. Arch Neurol. 1977;34(2):132. doi:10.1001/archneur.1977.00500140086023

113. Angus DC. Randomized clinical trials of artificial intelligence. JAMA. 2020;323(11):1043-1045. doi:10.1001/jama.2020.1039

114. Topol EJ. Welcoming new guidelines for AI clinical research. Nat Med. 2020;26(9):1318-1320. doi:10.1038/s41591-020-1042-x

115. Collins GS, Moons KGM. Reporting of artificial intelligence prediction models. Lancet. 2019;393(10181):1577-1579. doi:10.1016/S0140-6736(19)30037-6

116. Cruz Rivera S, Liu X, Chan AW, et al. Guidelines for clinical trial protocols for interventions involving artificial intelligence: the SPIRIT-AI extension. Nat Med. 2020;26(9):1351-1363. doi:10.1038/s41591-020-1037-7

117. Liu X, Cruz Rivera S, Moher D, Calvert MJ, Denniston AK; SPIRIT-AI and CONSORT-AI Working Group. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Nat Med. 2020;26(9):1364-1374. doi:10.1038/s41591-020-1034-x

118. McCulloch WS, Pitts W. A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys. 1943;5(4):115-133. doi:10.1007/BF02478259

119. Samuel AL. Some studies in machine learning using the game of Checkers. IBM J Res Dev. 1959;3(3):535-554. Accessed September 15, 2021. https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.368.2254

120. Sonoda M, Takano M, Miyahara J, Kato H. Computed radiography utilizing scanning laser stimulated luminescence. Radiology. 1983;148(3):833-838. doi:10.1148/radiology.148.3.6878707

121. Dechter R. Learning while searching in constraint-satisfaction-problems. AAAI’86: proceedings of the fifth AAAI national conference on artificial intelligence. Published 1986. Accessed September 15, 2021. https://www.aaai.org/Papers/AAAI/1986/AAAI86-029.pdf

122. Le Cun Y, Jackel LD, Boser B, et al. Handwritten digit recognition: applications of neural network chips and automatic learning. IEEE Commun Mag. 1989;27(11):41-46. doi:10.1109/35.41400

123. US Food and Drug Administration. FDA allows marketing of first whole slide imaging system for digital pathology [press release]. Published April 12, 2017. Accessed September 15, 2021. https://www.fda.gov/news-events/press-announcements/fda-allows-marketing-first-whole-slide-imaging-system-digital-pathology

Author and Disclosure Information

L. Brannon Thomas is Chief of the Microbiology Laboratory, Stephen Mastorides is Chief of Pathology, Narayan Viswanadhan is Assistant Chief of Radiology, Colleen Jakey is Chief of Staff, and Andrew Borkowski is Chief of the Molecular Diagnostics Laboratory, all at James A. Haley Veterans’ Hospital in Tampa, Florida. Andrew Borkowski and Stephen Mastorides are Professors, Colleen Jakey is an Associate Professor, and L. Brannon Thomas is an Associate Professor, all at the University of South Florida, Morsani College of Medicine in Tampa.
Correspondence: L. Brannon Thomas (lamar.thomas@va.gov)

Author disclosures
The authors report no actual or potential conflicts of interest with regard to this article.

Disclaimer
The opinions expressed herein are those of the authors and do not necessarily reflect those of Federal Practitioner, Frontline Medical Communications Inc., the US Government, or any of its agencies.

Issue: Federal Practitioner - 38(11)a
Page Number: 527-538

Artificial intelligence (AI) was first described in 1956 and refers to machines having the ability to learn as they receive and process information, resulting in the ability to “think” like humans.1 AI’s impact in medicine is increasing; currently, at least 29 AI medical devices and algorithms are approved by the US Food and Drug Administration (FDA) in a variety of areas, including radiograph interpretation, glucose management in patients with diabetes mellitus, electrocardiogram (ECG) analysis, and sleep disorder diagnosis, among others.2 Significantly, in 2020, the Centers for Medicare & Medicaid Services (CMS) announced the first reimbursement to hospitals for an AI platform, a model for early detection of strokes.3 AI is rapidly becoming an integral part of health care, and its role will only increase in the future (Table).

fdp03811527_t.png

As knowledge in medicine is expanding exponentially, AI has great potential to assist with handling complex patient care data. The concept of exponential growth is not an intuitive one. As Bini described, with exponential growth, an increase in knowledge equivalent to that amassed over the past 10 years may now occur in perhaps only 1 year.1 Likewise, advances equivalent to those of the past year may take just a few months. This phenomenon is partly due to the law of accelerating returns, which states that advances feed on themselves, continually increasing the rate of further advances.4 The volume of medical data doubles every 2 to 5 years.5 Fortunately, the field of AI is growing exponentially as well and can help health care practitioners (HCPs) keep pace, allowing the continued delivery of effective health care.
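To make the doubling-time figure concrete, the following minimal sketch computes how much a quantity grows when it doubles at a fixed interval. The function name `growth_factor` and the assumption of a perfectly constant doubling time are introduced here for illustration only; real growth in medical data is unlikely to be this regular.

```python
def growth_factor(years: float, doubling_time_years: float) -> float:
    """Return the multiplicative growth after `years` of steady doubling."""
    if doubling_time_years <= 0:
        raise ValueError("doubling_time_years must be positive")
    return 2.0 ** (years / doubling_time_years)


# The 2- and 5-year doubling times bracket the range quoted in the text.
for doubling_time in (2, 5):
    factor = growth_factor(10, doubling_time)
    print(f"Doubling every {doubling_time} years -> {factor:.0f}-fold growth in 10 years")
```

Under these idealized assumptions, a 2-year doubling time implies a 32-fold increase over a decade, whereas a 5-year doubling time implies a 4-fold increase.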

In this report, we review common terminology, principles, and general applications of AI, followed by current and potential applications of AI for selected medical specialties. Finally, we discuss AI’s future in health care, along with potential risks and pitfalls.

 

AI Overview

AI refers to machine programs that can “learn” or think based on past experiences. This functionality contrasts with simple rules-based programming available to health care for years. An example of rules-based programming is the warfarindosing.org website developed by Barnes-Jewish Hospital at Washington University Medical Center, which guides initial warfarin dosing.6,7 The prescriber inputs detailed patient information, including age, sex, height, weight, tobacco history, medications, laboratory results, and genotype if available. The application then calculates recommended warfarin dosing regimens to avoid over- or underanticoagulation. While the dosing algorithm may be complex, it depends entirely on preprogrammed rules. The program does not learn to reach its conclusions and recommendations from patient data.

In contrast, one of the most common subsets of AI is machine learning (ML). ML describes a program that “learns from experience and improves its performance as it learns.”1 With ML, the computer is initially provided with a training data set—data with known outcomes or labels. Because the initial data are input from known samples, this type of AI is known as supervised learning.8-10 As an example, we recently reported using ML to diagnose various types of cancer from pathology slides.11 In one experiment, we captured images of colon adenocarcinoma and normal colon (these 2 groups represent the training data set). Unlike traditional programming, we did not define characteristics that would differentiate colon cancer from normal; rather, the machine learned these characteristics independently by assessing the labeled images provided. A second data set (the validation data set) was used to evaluate the program and fine-tune the ML training model’s parameters. Finally, the program was presented with new images of cancer and normal cases for final assessment of accuracy (test data set). Our program learned to recognize differences from the images provided and was able to differentiate normal and cancer images with > 95% accuracy.
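The three-way split into training, validation, and test data sets can be sketched in a few lines. This is an illustration only and not the image-based pipeline reported in the cited work: it assumes synthetic 2-feature samples in place of pathology images, swaps in a simple k-nearest-neighbor classifier, and uses hypothetical function names throughout.

```python
import random


def make_synthetic_samples(n_per_class, rng):
    """Return (features, label) pairs; two numeric features stand in for image data."""
    samples = []
    for label, center in ((0, (0.0, 0.0)), (1, (3.0, 3.0))):
        for _ in range(n_per_class):
            features = (rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0))
            samples.append((features, label))
    return samples


def knn_predict(train, features, k):
    """Majority vote among the k labeled training samples closest to `features`."""
    nearest = sorted(
        train,
        key=lambda s: (s[0][0] - features[0]) ** 2 + (s[0][1] - features[1]) ** 2,
    )[:k]
    votes_for_class_1 = sum(label for _, label in nearest)
    return 1 if 2 * votes_for_class_1 > k else 0


def accuracy(train, held_out, k):
    """Fraction of held-out samples whose predicted label matches the known label."""
    correct = sum(knn_predict(train, f, k) == y for f, y in held_out)
    return correct / len(held_out)


def run_experiment(seed=0):
    rng = random.Random(seed)
    data = make_synthetic_samples(150, rng)
    rng.shuffle(data)
    train, validation, test = data[:180], data[180:240], data[240:]
    # The validation set is used only to tune the model (here, the choice of k).
    best_k = max((1, 3, 5, 7), key=lambda k: accuracy(train, validation, k))
    # The test set is touched once, for the final accuracy estimate.
    return best_k, accuracy(train, test, best_k)


best_k, test_accuracy = run_experiment()
print(f"k chosen on validation data: {best_k}; test accuracy: {test_accuracy:.2f}")
```

Keeping the test set out of every tuning decision is what makes the final accuracy an honest estimate of performance on new cases.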

Advances in computer processing have allowed for the development of artificial neural networks (ANNs). While there are several types of ANNs, the most common types used for image classification and segmentation are known as convolutional neural networks (CNNs).9,12-14 These programs are designed to work similarly to the human brain, specifically the visual cortex.15,16 As data are acquired, they are processed by various layers in the program. Much like neurons in the brain, one layer decides whether to advance information to the next.13,14 CNNs can be many layers deep, leading to the term deep learning: “computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction.”1,13,17
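The core operation of a single convolutional layer can be shown on a tiny gray-scale patch. The sketch below is a simplified illustration under stated assumptions: the kernel weights are written by hand (in a trained CNN they are learned from data), the kernel is not flipped (as in most deep learning libraries, so this is strictly cross-correlation), and the names `conv2d_valid` and `relu` are chosen here for clarity.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Pass positive responses to the next layer and zero out the rest."""
    return [[max(0, value) for value in row] for row in feature_map]


# A 4x4 gray-scale patch: dark on the left, bright on the right.
patch = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# Hand-written kernel that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]
feature_map = relu(conv2d_valid(patch, vertical_edge))
for row in feature_map:
    print(row)  # each row is [0, 18, 0]: the response peaks where the edge sits
```

Stacking many such layers, each operating on the feature maps of the one before, is what gives deep networks their multiple levels of abstraction.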

ANNs can process larger volumes of data. This advance has led to the development of unstructured or unsupervised learning. With this type of learning, inputting defined features (ie, predetermined answers) of the training data set described above is no longer required.1,8,10,14 The advantage of unsupervised learning is that the program can be presented raw data and extract meaningful interpretation without human input, often with less bias than may exist with supervised learning.1,18 If shown enough data, the program can extract relevant features to make conclusions independently without predefined definitions, potentially uncovering markers not previously known. For example, several studies have used unsupervised learning to search patient data to assess readmission risks of patients with congestive heart failure.10,19,20 The AI independently compiled features that had not been previously defined and predicted which patients were at greater risk for readmission more accurately than traditional methods.
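A small example may help show what learning without labels looks like in practice. The sketch below uses k-means clustering, a classic unsupervised technique swapped in here for simplicity; it is not the method used in the cited readmission studies, the data points are invented, and initializing from the first k points is a simplification (practical implementations usually use randomized initialization).

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Group unlabeled points into k clusters; returns (centroids, assignments)."""
    centroids = list(points[:k])  # simplistic initialization, for illustration only
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda j: squared_distance(point, centroids[j]))
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments


# Six unlabeled observations with 2 features each (made-up values).
observations = [(1.0, 1.2), (0.8, 0.9), (1.3, 1.1), (8.0, 8.2), (7.9, 8.1), (8.3, 7.8)]
centroids, assignments = kmeans(observations, k=2)
print(assignments)  # the first three and last three points fall into different groups
```

No outcome labels are supplied at any point; the grouping emerges from the structure of the data alone.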

fdp03811527_f.png


A more detailed description of the various terminologies and techniques of AI is beyond the scope of this review.9,10,17,21 However, in this basic overview, we describe 4 general areas in which AI impacts health care (Figure).

Health Care Applications

Image analysis has seen the most AI health care applications.8,15 AI has shown potential in interpreting many types of medical images, including pathology slides, radiographs of various types, retina and other eye scans, and photographs of skin lesions. Many studies have demonstrated that AI can interpret these images as accurately as or even better than experienced clinicians.9,13,22-29 Studies have suggested AI interpretation of radiographs may better distinguish patients infected with COVID-19 from other causes of pneumonia, and AI interpretation of pathology slides may detect specific genetic mutations not previously identified without additional molecular tests.11,14,23,24,30-32
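Comparisons between AI and clinicians are usually expressed with standard diagnostic metrics. The sketch below computes sensitivity, specificity, and accuracy from binary reads; the labels are made up for illustration and are not data from any cited study, and the function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(truth, predicted):
    """Sensitivity, specificity, and accuracy for binary labels (1 = disease present).

    Assumes `truth` contains at least one positive and one negative case.
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Invented reads of 10 images against a reference standard.
reference_standard = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
model_reads = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
clinician_reads = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

model_metrics = diagnostic_metrics(reference_standard, model_reads)
clinician_metrics = diagnostic_metrics(reference_standard, clinician_reads)
print(model_metrics)
print(clinician_metrics)
```

In this invented example, both readers reach the same overall accuracy while differing in sensitivity and specificity, which is one reason a single accuracy figure rarely tells the whole story.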

The second area in which AI can impact health care is improving workflow and efficiency. AI has improved surgery scheduling, yielding significant savings, and decreased patient wait times for appointments.1 AI can screen and triage radiographs, allowing attention to be directed to critical patients. This use would be valuable in many busy clinical settings, such as those seen during the recent COVID-19 pandemic.8,23 Similarly, AI can screen retina images to prioritize urgent conditions.25 AI has improved pathologists’ efficiency when used to detect breast metastases.33 Finally, AI may reduce medical errors, thereby improving patient safety.8,9,34
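One simple way to picture AI-assisted triage is as a reading worklist reordered by an urgency score. The sketch below assumes each study arrives with a score between 0 and 1 produced by some upstream model; the study identifiers, scores, and the function name `build_worklist` are all hypothetical.

```python
import heapq


def build_worklist(studies):
    """Order studies so the highest urgency score is read first; ties keep arrival order."""
    heap = []
    for arrival_index, (study_id, urgency_score) in enumerate(studies):
        # heapq is a min-heap, so the score is negated to pop the most urgent first.
        heapq.heappush(heap, (-urgency_score, arrival_index, study_id))
    ordered = []
    while heap:
        _, _, study_id = heapq.heappop(heap)
        ordered.append(study_id)
    return ordered


# Hypothetical incoming studies as (identifier, model urgency score).
incoming = [("CXR-001", 0.12), ("CXR-002", 0.91), ("CXR-003", 0.47), ("CXR-004", 0.91)]
reading_order = build_worklist(incoming)
print(reading_order)  # ['CXR-002', 'CXR-004', 'CXR-003', 'CXR-001']
```

The score only changes the order in which studies are read; every study still reaches a clinician.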

A third health care benefit of AI is in public health and epidemiology. AI can assist with clinical decision-making and diagnoses in low-income countries and areas with limited health care resources and personnel.25,29 AI can improve identification of infectious outbreaks, such as tuberculosis, malaria, dengue fever, and influenza.29,35-40 AI has been used to predict transmission patterns of the Zika virus and the current COVID-19 pandemic.41,42 Applications can stratify the risk of outbreaks based on multiple factors, including age, income, race, atypical geographic clusters, and seasonal factors like rainfall and temperature.35,36,38,43 AI has been used to assess morbidity and mortality, such as predicting disease severity with malaria and identifying treatment failures in tuberculosis.29

Finally, AI can dramatically impact health care through its ability to process large data sets or disconnected volumes of patient information, so-called big data.44-46 An example is the widespread use of electronic health records (EHRs) such as the Computerized Patient Record System used in Veterans Affairs medical centers (VAMCs). Much patient information exists as written text: HCP notes, laboratory and radiology reports, medication records, etc. Natural language processing (NLP) allows platforms to sort through extensive volumes of data on complex patients at rates much faster than human capability, which has great potential to assist with diagnosis and treatment decisions.9
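To give a sense of what extracting information from free text involves, the sketch below flags clinical terms in a note and marks whether each is affirmed or negated. It is a deliberately simple, rule-based toy: production NLP systems typically rely on statistical or neural models, and the cue list, the sample note, and the function name `find_mentions` are assumptions made for illustration.

```python
import re

# A small, hand-picked list of negation cues (illustrative, far from complete).
NEGATION_CUES = ("no", "denies", "without", "negative for")


def find_mentions(note, terms):
    """Return {term: 'affirmed' | 'negated'} for each term found in the note."""
    results = {}
    for sentence in re.split(r"[.;\n]+", note.lower()):
        for term in terms:
            match = re.search(r"\b" + re.escape(term) + r"\b", sentence)
            if not match:
                continue
            # A term counts as negated if a cue appears earlier in the same sentence.
            preceding = sentence[: match.start()]
            negated = any(
                re.search(r"\b" + re.escape(cue) + r"\b", preceding)
                for cue in NEGATION_CUES
            )
            results[term] = "negated" if negated else "affirmed"
    return results


note = (
    "Patient reports chest pain since yesterday. "
    "Denies fever; no shortness of breath. History of diabetes."
)
mentions = find_mentions(note, ["chest pain", "fever", "shortness of breath", "diabetes"])
print(mentions)
```

Even this toy shows why clinical text is hard to process: the same term can signal a finding or its absence depending on a single nearby word.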

Medical literature is being produced at rates that exceed our ability to digest it. More than 200,000 cancer-related articles were published in 2019 alone.14 NLP capabilities of AI have the potential to rapidly sort through this extensive medical literature and relate it to specific verbiage in patient records to guide therapy.46 IBM Watson, a supercomputer based on ML and NLP, demonstrates this concept with many potential applications, only some of which relate to health care.1,9 Watson has an oncology component to assimilate multiple aspects of patient care, including clinical notes, pathology results, radiograph findings, staging, and a tumor’s genetic profile. It coordinates these inputs from the EHR and mines medical literature and research databases to recommend treatment options.1,46 AI can assess and compile far more patient data and therapeutic options than would be feasible for individual clinicians, thus supporting customized patient care.47 Watson has partnered with numerous medical centers, including MD Anderson Cancer Center and Memorial Sloan Kettering Cancer Center, with variable success.44,47-49 While the full potential of Watson appears not yet realized, these AI-driven approaches will likely play an important role in leveraging the hidden value in the expanding volume of health care information.

Medical Specialty Applications

Radiology

Currently, > 70% of FDA-approved AI medical devices are in the field of radiology.2 Most radiology departments have used AI-friendly digital imaging for years, such as the picture archiving and communication systems used by numerous health care systems, including VAMCs.2,15 Gray-scale images common in radiology lend themselves to standardization, although AI is not limited to black-and-white image interpretation.15

An abundance of literature describes plain radiograph interpretation using AI. One FDA-approved platform improved X-ray diagnosis of wrist fractures when used by emergency medicine clinicians.2,50 AI has been applied to chest X-ray (CXR) interpretation of many conditions, including pneumonia, tuberculosis, malignant lung lesions, and COVID-19.23,25,28,44,51-53 For example, Nam and colleagues suggested AI is better at diagnosing malignant pulmonary nodules from CXRs than are trained radiologists.28

In addition to plain radiographs, AI has been applied to many other imaging technologies, including ultrasounds, positron emission tomography, mammograms, computed tomography (CT), and magnetic resonance imaging (MRI).15,26,44,48,54-56 A large study demonstrated that ML platforms significantly reduced the time to diagnose intracranial hemorrhages on CT and identified subtle hemorrhages missed by radiologists.55 Other studies have claimed that AI programs may be better than radiologists in detecting cancer in screening mammograms, and 3 FDA-approved devices focus on mammogram interpretation.2,15,54,57 There is also great interest in MRI applications to detect and predict prognosis for breast cancer based on imaging findings.21,56

Aside from providing accurate diagnoses, other studies focus on AI radiograph interpretation to assist with patient screening, triage, improving time to final diagnosis, providing a rapid “second opinion,” and even monitoring disease progression and offering insights into prognosis.8,21,23,52,55,56,58 These features help in busy urban centers but may play an even greater role in areas with limited access to health care or trained specialists such as radiologists.52

Cardiology

Cardiology has the second highest number of FDA-approved AI applications.2 Many cardiology AI platforms involve image analysis, as described in several recent reviews.45,59,60 AI has been applied to echocardiography to measure ejection fractions, detect valvular disease, and assess heart failure from hypertrophic and restrictive cardiomyopathy and amyloidosis.45,48,59 Applications for cardiac CT scans and CT angiography have successfully quantified both calcified and noncalcified coronary artery plaques, assessed the coronary lumen and myocardial perfusion, and performed coronary artery calcium scoring.45,59,60 Likewise, AI applications for cardiac MRI have been used to quantitate ejection fraction, assess large vessel flow, and measure cardiac scar burden.45,59

For years ECG devices have provided interpretation with limited accuracy using preprogrammed parameters.48 However, the application of AI allows ECG interpretation on par with trained cardiologists. Numerous such AI applications exist, and 2 FDA-approved devices perform ECG interpretation.2,61-64 One of these devices incorporates an AI-powered stethoscope to detect atrial fibrillation and heart murmurs.65

Pathology

The advancement of whole slide imaging, wherein entire slides can be scanned and digitized at high speed and resolution, creates great potential for AI applications in pathology.12,24,32,33,66 A landmark study demonstrating the potential of AI for assessing whole slide imaging examined sentinel lymph node metastases in patients with breast cancer.22 Multiple algorithms in the study demonstrated that AI was equivalent to or better than pathologists in detecting metastases, especially when the pathologists were time-constrained, consistent with a normal working environment. Significantly, the most accurate and efficient diagnoses were achieved when the pathologist and AI interpretations were used together.22,33

AI has shown promise in diagnosing many other entities, including cancers of the prostate (including Gleason scoring), lung, colon, breast, and skin.11,12,24,27,32,67 In addition, AI has shown great potential in scoring biomarkers important for prognosis and treatment, such as immunohistochemistry (IHC) labeling of Ki-67 and PD-L1.32 Pathologists can have difficulty classifying certain tumors or determining the site of origin for metastases, often having to rely on IHC with limited success. The unique features of image analysis with AI have the potential to assist in classifying difficult tumors and identifying sites of origin for metastatic disease based on morphology alone.11

Oncology depends heavily on molecular pathology testing to dictate treatment options and determine prognosis. Preliminary studies suggest that AI interpretation alone has the potential to delineate whether certain molecular mutations are present in tumors from various sites.11,14,24,32 One study combined histology and genomic results for AI interpretation that improved prognostic predictions.68 In addition, AI analysis may have potential in predicting tumor recurrence or prognosis based on cellular features, as demonstrated for lung cancer and melanoma.67,69,70

Ophthalmology

AI applications for ophthalmology have focused on diabetic retinopathy, age-related macular degeneration, glaucoma, retinopathy of prematurity, age-related and congenital cataracts, and retinal vein occlusion.71-73 Diabetic retinopathy is a leading cause of blindness and has been studied by numerous platforms with good success, most having used color fundus photography.71,72 One study showed AI could diagnose diabetic retinopathy and diabetic macular edema with specificities similar to ophthalmologists.74 In 2018, the FDA approved the AI platform IDx-DR. This diagnostic system classifies retinal images and recommends referral for patients determined to have “more than mild diabetic retinopathy” and reexamination within a year for other patients.8,75 Significantly, the platform recommendations do not require confirmation by a clinician.8

AI has been applied to other modalities in ophthalmology such as optical coherence tomography (OCT) to diagnose retinal disease and to predict appropriate management of congenital cataracts.25,73,76 For example, an AI application using OCT has been demonstrated to match or exceed the accuracy of retinal experts in diagnosing and triaging patients with a variety of retinal pathologies, including patients needing urgent referrals.77

Dermatology

Multiple studies demonstrate that AI performs at least as well as experienced dermatologists in differentiating selected skin lesions.78-81 For example, Esteva and colleagues demonstrated AI could differentiate keratinocyte carcinomas from benign seborrheic keratoses and malignant melanomas from benign nevi with accuracy equal to that of 21 board-certified dermatologists.78

AI is applicable to various imaging procedures common to dermatology, such as dermoscopy, very high-frequency ultrasound, and reflectance confocal microscopy.82 Several studies have demonstrated that AI interpretation compared favorably to dermatologists evaluating dermoscopy to assess melanocytic lesions.78-81,83

A limitation in these studies is that they differentiate only a few diagnoses.82 Furthermore, dermatologists have sensory input such as touch and visual examination under various conditions, something AI has yet to replicate.15,34,84 Also, most AI devices use no or limited clinical information.81 Dermatologists can recognize rarer conditions for which AI models may have had limited or no training.34 Nevertheless, a recent study assessed AI for the diagnosis of 134 separate skin disorders with promising results, including providing diagnoses with accuracy comparable to that of dermatologists and providing accurate treatment strategies.84 As Topol points out, most skin lesions are diagnosed in the primary care setting where AI can have a greater impact when used in conjunction with the clinical impression, especially where specialists are in limited supply.48,78

Finally, dermatology lends itself to using portable or smartphone applications (apps) wherein the user can photograph a lesion for analysis by AI algorithms to assess the need for further evaluation or make treatment recommendations.34,84,85 Although results from currently available apps are not encouraging, they may play a greater role as the technology advances.34,85

Oncology

Applications of AI in oncology include predicting prognosis for patients with cancer based on histologic and/or genetic information.14,68,86 Programs can predict the risk of complications before and recurrence risks after surgery for malignancies.44,87-89 AI can also assist in treatment planning and predict treatment failure with radiation therapy.90,91

AI has great potential in processing the large volumes of patient data in cancer genomics. Next-generation sequencing has allowed for the identification of millions of DNA sequences in a single tumor to detect genetic anomalies.92 Thousands of mutations can be found in individual tumor samples, and processing this information and determining its significance can be beyond human capability.14 We know little about the effects of various mutation combinations, and most tumors have a heterogeneous molecular profile among different cell populations.14,93 The presence or absence of various mutations can have diagnostic, prognostic, and therapeutic implications.93 AI has great potential to sort through these complex data and identify actionable findings.

More than 200,000 cancer-related articles were published in 2019, and publications in the field of cancer genomics are increasing exponentially.14,92,93 Patel and colleagues assessed the utility of IBM Watson for Genomics against results from a molecular tumor board.93 Watson for Genomics identified potentially significant mutations not identified by the tumor board in 32% of patients. Most mutations were related to new clinical trials not yet added to the tumor board watch list, demonstrating the role AI will have in processing the large volume of genetic data required to deliver personalized medicine moving forward.

Gastroenterology

AI has shown promise in predicting risk or outcomes based on clinical parameters in various common gastroenterology problems, including gastric reflux, acute pancreatitis, gastrointestinal bleeding, celiac disease, and inflammatory bowel disease.94,95 AI endoscopic analysis has demonstrated potential in assessing Barrett’s esophagus, gastric Helicobacter pylori infections, gastric atrophy, and gastric intestinal metaplasia.95 Applications have been used to assess esophageal, gastric, and colonic malignancies, including depth of invasion based on endoscopic images.95 Finally, studies have evaluated AI to assess small colon polyps during colonoscopy, including differentiating benign and premalignant polyps with success comparable to gastroenterologists.94,95 AI has been shown to increase the speed and accuracy of gastroenterologists in detecting small polyps during colonoscopy.48 In a prospective randomized study, colonoscopies performed using an AI device identified significantly more small adenomatous polyps than colonoscopies without AI.96

Neurology

It has been suggested that AI technologies are well suited for application in neurology due to the subtle presentation of many neurologic diseases.16 Viz LVO, the first AI platform for the diagnosis of strokes to be approved for CMS reimbursement, analyzes CTs to detect early ischemic strokes and alerts the medical team, thus shortening time to treatment.3,97 Many other AI platforms are in use or development that use CT and MRI for the early detection of strokes as well as for treatment and prognosis.9,97

AI technologies have been applied to neurodegenerative diseases, such as Alzheimer and Parkinson diseases.16,98 For example, several studies have evaluated patient movements in Parkinson disease both for early diagnosis and to assess response to treatment.98 These evaluations included assessment with external cameras as well as wearable devices and smartphone apps.

AI has also been applied to seizure disorders, attempting to determine seizure type, localize the area of seizure onset, and address the challenges of identifying seizures in neonates.99,100 Other potential applications range from early detection and prognosis predictions for cases of multiple sclerosis to restoring movement in paralysis from a variety of conditions such as spinal cord injury.9,101,102

Mental Health

Due to the interactive nature of mental health care, the field has been slower to develop AI applications.18 With heavy reliance on textual information (eg, clinic notes, mood rating scales, and documentation of conversations), successful AI applications in this field will likely rely heavily on NLP.18 However, studies investigating the application of AI to mental health have also incorporated data such as brain imaging, smartphone monitoring, and social media platforms, such as Facebook and Twitter.18,103,104

The risk of suicide is higher in veteran patients, and ML algorithms have had limited success in predicting suicide risk in both veteran and nonveteran populations.104-106 While early models have low positive predictive values and low sensitivities, they still promise to be a useful tool in conjunction with traditional risk assessments.106 Kessler and colleagues suggest that combining multiple rather than single ML algorithms might lead to greater success.105,106
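The two metrics named above can be illustrated with a short calculation. The following is a minimal sketch using made-up counts, not data from the cited studies; the function names and the cohort of 10,000 patients are assumptions chosen only to show how a rare outcome can leave a model with both low sensitivity and a low positive predictive value.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of actual events that the model flagged."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Share of flagged patients who actually had the event."""
    return true_pos / (true_pos + false_pos)


# Hypothetical cohort (illustrative numbers only): 10,000 patients,
# 50 events, and a model that flags 500 patients, 20 of them correctly.
true_pos = 20
false_neg = 50 - true_pos    # events the model missed
false_pos = 500 - true_pos   # flagged patients without the event

print(f"sensitivity = {sensitivity(true_pos, false_neg):.2f}")
print(f"PPV         = {positive_predictive_value(true_pos, false_pos):.2f}")
```

In this hypothetical cohort most flagged patients do not have the event, which is one reason such models are described as adjuncts to traditional risk assessment rather than replacements for it.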

AI may assist in diagnosing other mental health disorders, including major depressive disorder, attention deficit hyperactivity disorder (ADHD), schizophrenia, posttraumatic stress disorder, and Alzheimer disease.103,104,107 These investigations are in the early stages with limited clinical applicability. However, 2 AI applications awaiting FDA approval relate to ADHD and opioid use.2 Furthermore, potential exists for AI to not only assist with prevention and diagnosis of ADHD, but also to identify optimal treatment options.2,103

General and Personalized Medicine

Additional AI applications include diagnosing patients with suspected sepsis, measuring liver iron concentrations, predicting hospital mortality at the time of admission, and more.2,108,109 AI can guide end-of-life decisions such as resuscitation status or whether to initiate mechanical ventilation.48

AI-driven smartphone apps can be beneficial to both patients and clinicians. Examples include predicting nonadherence to anticoagulation therapy, monitoring heart rhythms for atrial fibrillation or signs of hyperkalemia in patients with renal failure, and improving outcomes for patients with diabetes mellitus by decreasing glycemic variability and reducing hypoglycemia.8,48,110,111 The potential applications of AI to health care and personalized medicine are almost limitless.

Discussion

With ever-increasing expectations for all health care sectors to deliver timely, fiscally responsible, high-quality health care, AI has the potential to have numerous impacts. AI can improve diagnostic accuracy while limiting errors and can improve patient safety, for example by assisting with prescription delivery.8,9,34 It can screen and triage patients, alerting clinicians to those needing more urgent evaluation.8,23,77,97 AI also may increase a clinician’s efficiency and speed to render a diagnosis.12,13,55,97 AI can provide a rapid second opinion, an ability especially beneficial in underserved areas with shortages of specialists.23,25,26,29,34 Similarly, AI may decrease the inter- and intraobserver variability common in many medical specialties.12,27,45 AI applications can also monitor disease progression, identify patients at greatest risk, and provide information for prognosis.21,23,56,58 Finally, as described with applications using IBM Watson, AI can allow for an integrated approach to health care that is currently lacking.

We have described many reports suggesting AI can render diagnoses as well as or better than experienced clinicians, and speculation exists that AI will replace many roles currently performed by health care practitioners.9,26 However, most studies demonstrate that AI’s diagnostic benefits are best realized when used to supplement a clinician’s impression.8,22,30,33,52,54,56,69,84 AI is not likely to replace humans in health care in the foreseeable future. The technology can be likened to the impact of CT scans developed in the 1970s in neurology. Prior to such detailed imaging, neurologists spent extensive time performing detailed physicals to render diagnoses and locate lesions before surgery. There was mistrust of this new technology and concern that CT scans would eliminate the need for neurologists.112 On the contrary, neurology is alive and well, frequently being augmented by the technologies once speculated to replace it.

Commercial AI health care platforms represented a $2 billion industry in 2018 and are growing rapidly each year.13,32 Many AI products are offered ready for implementation for various tasks, including diagnostics, patient management, and improved efficiency. Others will likely be provided as templates suitable for modification to meet the specific needs of the facility, practice, or specialty for its patient population.

AI Risks and Limitations

AI has several risks and limitations. Although there is progress in explainable AI, at times we still struggle to understand how the output provided by machine learning algorithms was created.44,48 The many layers associated with deep learning self-determine the criteria used to reach a conclusion, and these criteria can continually evolve. The parameters of deep learning are not preprogrammed, and there are too many individual data points to be extrapolated or deconvoluted for evaluation at our current level of knowledge.26,51 This apparent lack of constraints causes concern for patient safety and suggests that greater validation and continued scrutiny of validity are required.8,48 Efforts are underway to create explainable AI programs to make their processes more transparent, but such clarification is limited presently.14,26,48,77

Another challenge of AI is determining the amount of training data required to function optimally. Also, if the output describes multiple variables or diagnoses, are each equally valid?113 Furthermore, many AI applications look for a specific process, such as cancer diagnoses on CXRs. However, how coexisting conditions like cardiomegaly, emphysema, pneumonia, etc, seen on CXRs will affect the diagnosis needs to be considered.51,52 Zech and colleagues provide the example that diagnoses for pneumothorax are frequently rendered on CXRs with chest tubes in place.51 They suggest that CNNs may develop a bias toward diagnosing pneumothorax when chest tubes are present. Many current studies approach an issue in isolation, a situation not realistic in real-world clinical practice.26
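The chest tube example can be made concrete with a toy simulation. This is a minimal sketch under stated assumptions: the records are invented, and `shortcut_model` is a hypothetical stand-in for a classifier that has learned the spurious cue (a visible chest tube) rather than the finding itself. It is not a model from the cited study.

```python
# Each hypothetical CXR record: (has_chest_tube, has_pneumothorax).
# In the training-like set, most pneumothorax films already show a chest tube.
training_like = (
    [(True, True)] * 80      # treated pneumothorax, tube visible
    + [(False, True)] * 20   # untreated pneumothorax
    + [(False, False)] * 95  # no pneumothorax, no tube
    + [(True, False)] * 5    # tube placed for another reason
)
# New, untreated patients: no film shows a chest tube yet.
untreated = [(False, True)] * 50 + [(False, False)] * 50


def shortcut_model(has_chest_tube: bool) -> bool:
    """Biased stand-in classifier: predicts pneumothorax whenever a chest
    tube is visible and ignores the lung itself."""
    return has_chest_tube


def accuracy(records) -> float:
    """Fraction of records where the shortcut prediction matches the label."""
    correct = sum(shortcut_model(tube) == label for tube, label in records)
    return correct / len(records)


print(f"training-like films: {accuracy(training_like):.3f}")
print(f"untreated patients:  {accuracy(untreated):.3f}")
```

The shortcut looks accurate on data resembling the training set but is no better than chance on untreated patients, the group in which a new diagnosis would matter most.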

Most studies on AI have been retrospective, and frequently data used to train the program are preselected.13,26 The data are typically validated on available databases rather than actual patients in the clinical setting, limiting confidence in the validity of the AI output when applied to real-world situations. Currently, fewer than 12 prospective trials have been published comparing AI with traditional clinical care.13,114 Randomized prospective clinical trials are even fewer, with none currently reported from the United States.13,114 The results from several studies have been shown to diminish when repeated prospectively.114

The FDA has created a new category known as Software as a Medical Device and has a Digital Health Innovation Action Plan to regulate AI platforms. Still, the process of AI regulation is of necessity different from traditional approval processes and is continually evolving.8 The FDA approval process cannot account for the fact that the program’s parameters may continually evolve or adapt.2

Guidelines for investigating and reporting AI research with its unique attributes are being developed. Examples include the TRIPOD-ML statement and others.49,115 In September 2020, 2 publications addressed the paucity of gold-standard randomized clinical trials in clinical AI applications.116,117 The SPIRIT-AI statement expands on the original SPIRIT statement published in 2013 to guide minimal reporting standards for AI clinical trial protocols to promote transparency of design and methodology.116 Similarly, the CONSORT-AI extension, stemming from the original CONSORT statement in 1996, aims to ensure quality reporting of randomized controlled trials in AI.117

Another risk with AI is that while an individual physician making a mistake may adversely affect 1 patient, a single mistake in an AI algorithm could potentially affect thousands of patients.48 Also, AI programs developed for the patient population at one facility may not translate to another. Referred to as overfitting, this phenomenon relates to selection bias in training data sets.15,34,49,51,52 Studies have shown that programs that underrepresent certain group characteristics such as age, sex, or race may be less effective when applied to a population in which these characteristics have differing representations.8,48,49 This problem of underrepresentation has been demonstrated in programs interpreting pathology slides, radiographs, and skin lesions.15,32,51

Admittedly, most of these challenges are not specific to AI and existed in health care previously. Physicians make mistakes, treatments are sometimes used without adequate prospective studies, and medications are given without understanding their mechanism of action, much like AI-facilitated processes reach a conclusion that cannot be fully explained.48

Conclusions

The view that AI will dramatically impact health care in the coming years will likely prove true. However, much work is needed, especially because of the paucity of the prospective clinical trials historically required in medical research. Any concern that AI will replace HCPs seems unwarranted. Early studies suggest that even AI programs that appear to exceed human interpretation perform best when working in cooperation with, and under oversight from, clinicians. AI’s greatest potential appears to be its ability to augment care from health professionals, improving efficiency and accuracy, and it should be anticipated with enthusiasm as the field moves forward at an exponential rate.

Acknowledgments

The authors thank Makenna G. Thomas for proofreading and review of the manuscript. This material is the result of work supported with resources and the use of facilities at the James A. Haley Veterans’ Hospital. This research has been approved by the James A. Haley Veterans’ Hospital Office of Communications and Media.

Artificial Intelligence (AI) was first described in 1956 and refers to machines having the ability to learn as they receive and process information, resulting in the ability to “think” like humans.1 AI’s impact in medicine is increasing; currently, at least 29 AI medical devices and algorithms are approved by the US Food and Drug Administration (FDA) in a variety of areas, including radiograph interpretation, managing glucose levels in patients with diabetes mellitus, analyzing electrocardiograms (ECGs), and diagnosing sleep disorders among others.2 Significantly, in 2020, the Centers for Medicare and Medicaid Services (CMS) announced the first reimbursement to hospitals for an AI platform, a model for early detection of strokes.3 AI is rapidly becoming an integral part of health care, and its role will only increase in the future (Table).

fdp03811527_t.png

As knowledge in medicine is expanding exponentially, AI has great potential to assist with handling complex patient care data. The concept of exponential growth is not a natural one. As Bini described, with exponential growth the volume of knowledge amassed over the past 10 years will now occur in perhaps only 1 year.1 Likewise, equivalent advances over the past year may take just a few months. This phenomenon is partly due to the law of accelerating returns, which states that advances feed on themselves, continually increasing the rate of further advances.4 The volume of medical data doubles every 2 to 5 years.5 Fortunately, the field of AI is growing exponentially as well and can help health care practitioners (HCPs) keep pace, allowing the continued delivery of effective health care.
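The doubling-time figure quoted above can be turned into a quick calculation. The sketch below assumes idealized exponential growth, in which a quantity that doubles every T years grows by a factor of 2^(t/T) over t years; the function name `growth_factor` is ours for illustration and does not come from the cited sources.

```python
def growth_factor(years: float, doubling_time_years: float) -> float:
    """Factor by which a quantity grows over `years`, assuming it doubles
    every `doubling_time_years` (idealized exponential growth)."""
    if doubling_time_years <= 0:
        raise ValueError("doubling time must be positive")
    return 2.0 ** (years / doubling_time_years)


# Doubling times of 2 and 5 years bracket the range quoted in the text.
for doubling_time in (2, 5):
    factor = growth_factor(10, doubling_time)
    print(f"Doubling every {doubling_time} years: x{factor:.0f} over 10 years")
```

Under this assumption, a decade brings roughly a 4-fold increase at the slow end of the quoted range and a 32-fold increase at the fast end.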

In this report, we review common terminology, principles, and general applications of AI, followed by current and potential applications of AI for selected medical specialties. Finally, we discuss AI’s future in health care, along with potential risks and pitfalls.

AI Overview

AI refers to machine programs that can “learn” or think based on past experiences. This functionality contrasts with simple rules-based programming available to health care for years. An example of rules-based programming is the warfarindosing.org website developed by Barnes-Jewish Hospital at Washington University Medical Center, which guides initial warfarin dosing.6,7 The prescriber inputs detailed patient information, including age, sex, height, weight, tobacco history, medications, laboratory results, and genotype if available. The application then calculates recommended warfarin dosing regimens to avoid over- or underanticoagulation. While the dosing algorithm may be complex, it depends entirely on preprogrammed rules. The program does not learn to reach its conclusions and recommendations from patient data.

In contrast, one of the most common subsets of AI is machine learning (ML). ML describes a program that “learns from experience and improves its performance as it learns.”1 With ML, the computer is initially provided with a training data set—data with known outcomes or labels. Because the initial data are input from known samples, this type of AI is known as supervised learning.8-10 As an example, we recently reported using ML to diagnose various types of cancer from pathology slides.11 In one experiment, we captured images of colon adenocarcinoma and normal colon (these 2 groups represent the training data set). Unlike traditional programming, we did not define characteristics that would differentiate colon cancer from normal; rather, the machine learned these characteristics independently by assessing the labeled images provided. A second data set (the validation data set) was used to evaluate the program and fine-tune the ML training model’s parameters. Finally, the program was presented with new images of cancer and normal cases for final assessment of accuracy (test data set). Our program learned to recognize differences from the images provided and was able to differentiate normal and cancer images with > 95% accuracy.
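The three data sets described above can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the study: the 70/15/15 proportions, the file names, and the `split_dataset` helper are all assumptions made for the example.

```python
import random


def split_dataset(items, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle labeled items and split them into training, validation,
    and test sets. The default proportions are illustrative assumptions."""
    if not 0 < train_frac < 1 or not 0 <= val_frac < 1 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave room for a test set")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]                      # used to learn the model
    validation = shuffled[n_train:n_train + n_val]  # used to tune parameters
    test = shuffled[n_train + n_val:]               # held out for final accuracy
    return train, validation, test


# Hypothetical labeled images: (file name, label).
images = [(f"slide_{i:03d}.png", "cancer" if i % 2 else "normal") for i in range(100)]
train, validation, test = split_dataset(images)
print(len(train), len(validation), len(test))
```

Keeping the test set untouched until the end is what allows the final accuracy figure to estimate performance on images the program has never seen.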

Advances in computer processing have allowed for the development of artificial neural networks (ANNs). While there are several types of ANNs, the most common types used for image classification and segmentation are known as convolutional neural networks (CNNs).9,12-14 The programs are designed to work similarly to the human brain, specifically the visual cortex.15,16 As data are acquired, they are processed by various layers in the program. Much like neurons in the brain, one layer decides whether to advance information to the next.13,14 CNNs can be many layers deep, leading to the term deep learning: “computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction.”1,13,17
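To give a sense of what a single layer does, the sketch below applies one small filter to a toy gray-scale image and then passes only positive responses forward. It is a simplified illustration under stated assumptions: the image and the hand-written edge filter are invented, and a real CNN learns its filter weights from training data and stacks many such layers.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation, the 'convolution' used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Pass positive responses to the next layer; zero out the rest."""
    return [[max(0, value) for value in row] for row in feature_map]


# Toy 4x4 gray-scale image with a vertical edge (dark left, bright right).
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# Hand-written vertical-edge filter (an assumption for illustration).
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = relu(conv2d(image, kernel))
print(feature_map)
```

The output is nonzero only where the filter straddles the edge, which mirrors the idea in the text that a layer decides which information to advance to the next.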

ANNs can process larger volumes of data. This advance has led to the development of unstructured or unsupervised learning. With this type of learning, inputting defined features (ie, predetermined answers) of the training data set described above is no longer required.1,8,10,14 The advantage of unsupervised learning is that the program can be presented raw data and extract meaningful interpretation without human input, often with less bias than may exist with supervised learning.1,18 If shown enough data, the program can extract relevant features to make conclusions independently without predefined definitions, potentially uncovering markers not previously known. For example, several studies have used unsupervised learning to search patient data to assess readmission risks of patients with congestive heart failure.10,19,20 AI independently compiled features that had not been previously defined and predicted which patients were at greater risk for readmission, with performance superior to traditional methods.

fdp03811527_f.png


A more detailed description of the various terminologies and techniques of AI is beyond the scope of this review.9,10,17,21 However, in this basic overview, we describe 4 general areas in which AI impacts health care (Figure).

Health Care Applications

Image analysis has seen the most AI health care applications.8,15 AI has shown potential in interpreting many types of medical images, including pathology slides, radiographs of various types, retina and other eye scans, and photographs of skin lesions. Many studies have demonstrated that AI can interpret these images as accurately as or even better than experienced clinicians.9,13,22-29 Studies have suggested AI interpretation of radiographs may better distinguish patients infected with COVID-19 from other causes of pneumonia, and AI interpretation of pathology slides may detect specific genetic mutations not previously identified without additional molecular tests.11,14,23,24,30-32

The second area in which AI can impact health care is improving workflow and efficiency. AI has improved surgery scheduling, saving significant revenue, and decreased patient wait times for appointments.1 AI can screen and triage radiographs, allowing attention to be directed to critical patients. This use would be valuable in many busy clinical settings, such as those seen during the recent COVID-19 pandemic.8,23 Similarly, AI can screen retina images to prioritize urgent conditions.25 AI has improved pathologists’ efficiency when used to detect breast metastases.33 Finally, AI may reduce medical errors, thereby improving patient safety.8,9,34

A third health care benefit of AI is in public health and epidemiology. AI can assist with clinical decision-making and diagnoses in low-income countries and areas with limited health care resources and personnel.25,29 AI can improve identification of infectious outbreaks, such as tuberculosis, malaria, dengue fever, and influenza.29,35-40 AI has been used to predict transmission patterns of the Zika virus and the current COVID-19 pandemic.41,42 Applications can stratify the risk of outbreaks based on multiple factors, including age, income, race, atypical geographic clusters, and seasonal factors like rainfall and temperature.35,36,38,43 AI has been used to assess morbidity and mortality, such as predicting disease severity with malaria and identifying treatment failures in tuberculosis.29

Finally, AI can dramatically impact health care due to processing large data sets or disconnected volumes of patient information—so-called big data.44-46 An example is the widespread use of electronic health records (EHRs) such as the Computerized Patient Record System used in Veteran Affairs medical centers (VAMCs). Much of patient information exists in written text: HCP notes, laboratory and radiology reports, medication records, etc. Natural language processing (NLP) allows platforms to sort through extensive volumes of data on complex patients at rates much faster than human capability, which has great potential to assist with diagnosis and treatment decisions.9

Medical literature is being produced at rates that exceed our ability to digest. More than 200,000 cancer-related articles were published in 2019 alone.14 NLP capabilities of AI have the potential to rapidly sort through this extensive medical literature and relate specific verbiage in patient records guiding therapy.46 IBM Watson, a supercomputer based on ML and NLP, demonstrates this concept with many potential applications, only some of which relate to health care.1,9 Watson has an oncology component to assimilate multiple aspects of patient care, including clinical notes, pathology results, radiograph findings, staging, and a tumor’s genetic profile. It coordinates these inputs from the EHR and mines medical literature and research databases to recommend treatment options.1,46 AI can assess and compile far greater patient data and therapeutic options than would be feasible by individual clinicians, thus providing customized patient care.47 Watson has partnered with numerous medical centers, including MD Anderson Cancer Center and Memorial Sloan Kettering Cancer Center, with variable success.44,47-49 While the full potential of Watson appears not yet realized, these AI-driven approaches will likely play an important role in leveraging the hidden value in the expanding volume of health care information.

Medical Specialty Applications

Radiology

Currently, more than 70% of FDA-approved AI medical devices are in the field of radiology.2 Most radiology departments have used AI-friendly digital imaging for years, such as the picture archiving and communication systems used by numerous health care systems, including VAMCs.2,15 Gray-scale images common in radiology lend themselves to standardization, although AI is not limited to black-and-white image interpretation.15

An abundance of literature describes plain radiograph interpretation using AI. One FDA-approved platform improved X-ray diagnosis of wrist fractures when used by emergency medicine clinicians.2,50 AI has been applied to chest X-ray (CXR) interpretation of many conditions, including pneumonia, tuberculosis, malignant lung lesions, and COVID-19.23,25,28,44,51-53 For example, Nam and colleagues suggested AI is better at diagnosing malignant pulmonary nodules from CXRs than are trained radiologists.28

In addition to plain radiographs, AI has been applied to many other imaging technologies, including ultrasounds, positron emission tomography, mammograms, computed tomography (CT), and magnetic resonance imaging (MRI).15,26,44,48,54-56 A large study demonstrated that ML platforms significantly reduced the time to diagnose intracranial hemorrhages on CT and identified subtle hemorrhages missed by radiologists.55 Other studies have claimed that AI programs may be better than radiologists in detecting cancer in screening mammograms, and 3 FDA-approved devices focus on mammogram interpretation.2,15,54,57 There is also great interest in MRI applications to detect and predict prognosis for breast cancer based on imaging findings.21,56

Aside from providing accurate diagnoses, other studies focus on AI radiograph interpretation to assist with patient screening, triage, improving time to final diagnosis, providing a rapid “second opinion,” and even monitoring disease progression and offering insights into prognosis.8,21,23,52,55,56,58 These features help in busy urban centers but may play an even greater role in areas with limited access to health care or trained specialists such as radiologists.52

Cardiology

Cardiology has the second highest number of FDA-approved AI applications.2 Many cardiology AI platforms involve image analysis, as described in several recent reviews.45,59,60 AI has been applied to echocardiography to measure ejection fractions, detect valvular disease, and assess heart failure from hypertrophic and restrictive cardiomyopathy and amyloidosis.45,48,59 Applications for cardiac CT scans and CT angiography have successfully quantified both calcified and noncalcified coronary artery plaques and lumen assessments, assessed myocardial perfusion, and performed coronary artery calcium scoring.45,59,60 Likewise, AI applications for cardiac MRI have been used to quantitate ejection fraction, assess large vessel flow, and measure cardiac scar burden.45,59

For years ECG devices have provided interpretation with limited accuracy using preprogrammed parameters.48 However, the application of AI allows ECG interpretation on par with trained cardiologists. Numerous such AI applications exist, and 2 FDA-approved devices perform ECG interpretation.2,61-64 One of these devices incorporates an AI-powered stethoscope to detect atrial fibrillation and heart murmurs.65

Pathology

The advancement of whole slide imaging, wherein entire slides can be scanned and digitized at high speed and resolution, creates great potential for AI applications in pathology.12,24,32,33,66 A landmark study demonstrating the potential of AI for assessing whole slide imaging examined sentinel lymph node metastases in patients with breast cancer.22 Multiple algorithms in the study demonstrated that AI was equivalent to or better than pathologists in detecting metastases, especially when the pathologists were time-constrained consistent with a normal working environment. Significantly, the most accurate and efficient diagnoses were achieved when the pathologist and AI interpretations were used together.22,33

AI has shown promise in diagnosing many other entities, including cancers of the prostate (including Gleason scoring), lung, colon, breast, and skin.11,12,24,27,32,67 In addition, AI has shown great potential in scoring biomarkers important for prognosis and treatment, such as immunohistochemistry (IHC) labeling of Ki-67 and PD-L1.32 Pathologists can have difficulty classifying certain tumors or determining the site of origin for metastases, often having to rely on IHC with limited success. The unique features of image analysis with AI have the potential to assist in classifying difficult tumors and identifying sites of origin for metastatic disease based on morphology alone.11

Oncology depends heavily on molecular pathology testing to dictate treatment options and determine prognosis. Preliminary studies suggest that AI interpretation alone has the potential to delineate whether certain molecular mutations are present in tumors from various sites.11,14,24,32 One study combined histology and genomic results for AI interpretation that improved prognostic predictions.68 In addition, AI analysis may have potential in predicting tumor recurrence or prognosis based on cellular features, as demonstrated for lung cancer and melanoma.67,69,70

Ophthalmology

AI applications for ophthalmology have focused on diabetic retinopathy, age-related macular degeneration, glaucoma, retinopathy of prematurity, age-related and congenital cataracts, and retinal vein occlusion.71-73 Diabetic retinopathy is a leading cause of blindness and has been studied by numerous platforms with good success, most having used color fundus photography.71,72 One study showed AI could diagnose diabetic retinopathy and diabetic macular edema with specificities similar to ophthalmologists.74 In 2018, the FDA approved the AI platform IDx-DR. This diagnostic system classifies retinal images and recommends referral for patients determined to have “more than mild diabetic retinopathy” and reexamination within a year for other patients.8,75 Significantly, the platform recommendations do not require confirmation by a clinician.8

AI has been applied to other modalities in ophthalmology such as optical coherence tomography (OCT) to diagnose retinal disease and to predict appropriate management of congenital cataracts.25,73,76 For example, an AI application using OCT has been demonstrated to match or exceed the accuracy of retinal experts in diagnosing and triaging patients with a variety of retinal pathologies, including patients needing urgent referrals.77

Dermatology

Multiple studies demonstrate AI performs at least equal to experienced dermatologists in differentiating selected skin lesions.78-81 For example, Esteva and colleagues demonstrated AI could differentiate keratinocyte carcinomas from benign seborrheic keratoses and malignant melanomas from benign nevi with accuracy equal to 21 board-certified dermatologists.78

AI is applicable to various imaging procedures common to dermatology, such as dermoscopy, very high-frequency ultrasound, and reflectance confocal microscopy.82 Several studies have demonstrated that AI interpretation compared favorably to dermatologists evaluating dermoscopy to assess melanocytic lesions.78-81,83

A limitation in these studies is that they differentiate only a few diagnoses.82 Furthermore, dermatologists have sensory input such as touch and visual examination under various conditions, something AI has yet to replicate.15,34,84 Also, most AI devices use no or limited clinical information.81 Dermatologists can recognize rarer conditions for which AI models may have had limited or no training.34 Nevertheless, a recent study assessed AI for the diagnosis of 134 separate skin disorders with promising results, including providing diagnoses with accuracy comparable to that of dermatologists and providing accurate treatment strategies.84 As Topol points out, most skin lesions are diagnosed in the primary care setting where AI can have a greater impact when used in conjunction with the clinical impression, especially where specialists are in limited supply.48,78

Finally, dermatology lends itself to using portable or smartphone applications (apps) wherein the user can photograph a lesion for analysis by AI algorithms to assess the need for further evaluation or make treatment recommendations.34,84,85 Although results from currently available apps are not encouraging, they may play a greater role as the technology advances.34,85

Oncology

Applications of AI in oncology include predicting prognosis for patients with cancer based on histologic and/or genetic information.14,68,86 Programs can predict the risk of complications before and recurrence risks after surgery for malignancies.44,87-89 AI can also assist in treatment planning and predict treatment failure with radiation therapy.90,91

AI has great potential in processing the large volumes of patient data in cancer genomics. Next-generation sequencing has allowed for the identification of millions of DNA sequences in a single tumor to detect genetic anomalies.92 Thousands of mutations can be found in individual tumor samples, and processing this information and determining its significance can be beyond human capability.14 We know little about the effects of various mutation combinations, and most tumors have a heterogeneous molecular profile among different cell populations.14,93 The presence or absence of various mutations can have diagnostic, prognostic, and therapeutic implications.93 AI has great potential to sort through these complex data and identify actionable findings.

More than 200,000 cancer-related articles were published in 2019, and publications in the field of cancer genomics are increasing exponentially.14,92,93 Patel and colleagues assessed the utility of IBM Watson for Genomics against results from a molecular tumor board.93 Watson for Genomics identified potentially significant mutations not identified by the tumor board in 32% of patients. Most mutations were related to new clinical trials not yet added to the tumor board watch list, demonstrating the role AI will have in processing the large volume of genetic data required to deliver personalized medicine moving forward.

Gastroenterology

AI has shown promise in predicting risk or outcomes based on clinical parameters in various common gastroenterology problems, including gastric reflux, acute pancreatitis, gastrointestinal bleeding, celiac disease, and inflammatory bowel disease.94,95 AI endoscopic analysis has demonstrated potential in assessing Barrett’s esophagus, gastric Helicobacter pylori infections, gastric atrophy, and gastric intestinal metaplasia.95 Applications have been used to assess esophageal, gastric, and colonic malignancies, including depth of invasion based on endoscopic images.95 Finally, studies have evaluated AI to assess small colon polyps during colonoscopy, including differentiating benign and premalignant polyps with success comparable to gastroenterologists.94,95 AI has been shown to increase the speed and accuracy of gastroenterologists in detecting small polyps during colonoscopy.48 In a prospective randomized study, colonoscopies performed using an AI device identified significantly more small adenomatous polyps than colonoscopies without AI.96

Neurology

It has been suggested that AI technologies are well suited for application in neurology due to the subtle presentation of many neurologic diseases.16 Viz LVO, the first CMS-approved AI reimbursement for the diagnosis of strokes, analyzes CTs to detect early ischemic strokes and alerts the medical team, thus shortening time to treatment.3,97 Many other AI platforms are in use or development that use CT and MRI for the early detection of strokes as well as for treatment and prognosis.9,97

AI technologies have been applied to neurodegenerative diseases, such as Alzheimer and Parkinson diseases.16,98 For example, several studies have evaluated patient movements in Parkinson disease for both early diagnosis and to assess response to treatment.98 These evaluations included assessment with both external cameras as well as wearable devices and smartphone apps.

AI has also been applied to seizure disorders, attempting to determine seizure type, localize the area of seizure onset, and address the challenges of identifying seizures in neonates.99,100 Other potential applications range from early detection and prognosis predictions for cases of multiple sclerosis to restoring movement in paralysis from a variety of conditions such as spinal cord injury.9,101,102

Mental Health

Due to the interactive nature of mental health care, the field has been slower to develop AI applications.18 With heavy reliance on textual information (eg, clinic notes, mood rating scales, and documentation of conversations), successful AI applications in this field will likely rely heavily on NLP.18 However, studies investigating the application of AI to mental health have also incorporated data such as brain imaging, smartphone monitoring, and social media platforms, such as Facebook and Twitter.18,103,104

The risk of suicide is higher in veteran patients, and ML algorithms have had limited success in predicting suicide risk in both veteran and nonveteran populations.104-106 While early models have low positive predictive values and low sensitivities, they still promise to be a useful tool in conjunction with traditional risk assessments.106 Kessler and colleagues suggest that combining multiple rather than single ML algorithms might lead to greater success.105,106
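
The link between a low positive predictive value (PPV) and a rare outcome can be made concrete with Bayes' rule. The following is a minimal sketch in Python; the function name and all input values are hypothetical and chosen only for illustration, and none are taken from the cited studies.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Return P(condition present | positive result) using Bayes' rule."""
    true_pos = sensitivity * prevalence                    # fraction of population: true positives
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # fraction of population: false positives
    if true_pos + false_pos == 0.0:
        raise ValueError("no positive results are possible with these inputs")
    return true_pos / (true_pos + false_pos)


# Hypothetical inputs for illustration only (not from any cited study).
ppv_rare = positive_predictive_value(sensitivity=0.50, specificity=0.95, prevalence=0.001)
ppv_common = positive_predictive_value(sensitivity=0.50, specificity=0.95, prevalence=0.10)

print(f"PPV at 0.1% prevalence: {ppv_rare:.3f}")
print(f"PPV at 10% prevalence:  {ppv_common:.3f}")
```

Under these assumed inputs, the same model yields a PPV of roughly 1% when the outcome is rare and just over 50% when it is common, which is one reason such tools are proposed as adjuncts to, rather than replacements for, traditional risk assessment.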

AI may assist in diagnosing other mental health disorders, including major depressive disorder, attention deficit hyperactivity disorder (ADHD), schizophrenia, posttraumatic stress disorder, and Alzheimer disease.103,104,107 These investigations are in the early stages with limited clinical applicability. However, 2 AI applications awaiting FDA approval relate to ADHD and opioid use.2 Furthermore, potential exists for AI to not only assist with prevention and diagnosis of ADHD, but also to identify optimal treatment options.2,103

General and Personalized Medicine

Additional AI applications include diagnosing patients with suspected sepsis, measuring liver iron concentrations, predicting hospital mortality at the time of admission, and more.2,108,109 AI can guide end-of-life decisions such as resuscitation status or whether to initiate mechanical ventilation.48

AI-driven smartphone apps can be beneficial to both patients and clinicians. Examples include predicting nonadherence to anticoagulation therapy, monitoring heart rhythms for atrial fibrillation or signs of hyperkalemia in patients with renal failure, and improving outcomes for patients with diabetes mellitus by decreasing glycemic variability and reducing hypoglycemia.8,48,110,111 The potential for AI applications in health care and personalized medicine is almost limitless.

Discussion

With ever-increasing expectations for all health care sectors to deliver timely, fiscally responsible, high-quality health care, AI has the potential to have numerous impacts. AI can improve diagnostic accuracy while limiting errors and can improve patient safety, for example by assisting with prescription delivery.8,9,34 It can screen and triage patients, alerting clinicians to those needing more urgent evaluation.8,23,77,97 AI also may increase a clinician’s efficiency and speed to render a diagnosis.12,13,55,97 AI can provide a rapid second opinion, an ability especially beneficial in underserved areas with shortages of specialists.23,25,26,29,34 Similarly, AI may decrease the inter- and intraobserver variability common in many medical specialties.12,27,45 AI applications can also monitor disease progression, identify patients at greatest risk, and provide information for prognosis.21,23,56,58 Finally, as described with applications using IBM Watson, AI can allow for an integrated approach to health care that is currently lacking.

We have described many reports suggesting AI can render diagnoses as well as or better than experienced clinicians, and speculation exists that AI will replace many roles currently performed by health care practitioners.9,26 However, most studies demonstrate that AI’s diagnostic benefits are best realized when used to supplement a clinician’s impression.8,22,30,33,52,54,56,69,84 AI is not likely to replace humans in health care in the foreseeable future. The technology can be likened to the impact of CT scans developed in the 1970s in neurology. Prior to such detailed imaging, neurologists spent extensive time performing detailed physicals to render diagnoses and locate lesions before surgery. There was mistrust of this new technology and concern that CT scans would eliminate the need for neurologists.112 On the contrary, neurology is alive and well, frequently being augmented by the technologies once speculated to replace it.

Commercial AI health care platforms represented a $2 billion industry in 2018 and are growing rapidly each year.13,32 Many AI products are offered ready for implementation for various tasks, including diagnostics, patient management, and improved efficiency. Others will likely be provided as templates suitable for modification to meet the specific needs of the facility, practice, or specialty for its patient population.

AI Risks and Limitations

AI has several risks and limitations. Although there is progress in explainable AI, at times we still struggle to understand how the output provided by machine learning algorithms was created.44,48 The many layers associated with deep learning self-determine the criteria used to reach a conclusion, and these criteria can continually evolve. The parameters of deep learning are not preprogrammed, and there are too many individual data points to be extrapolated or deconvoluted for evaluation at our current level of knowledge.26,51 This apparent lack of constraints causes concern for patient safety and suggests that greater validation and continued scrutiny of validity are required.8,48 Efforts are underway to create explainable AI programs to make their processes more transparent, but such clarification is limited presently.14,26,48,77

Another challenge of AI is determining the amount of training data required to function optimally. Also, if the output describes multiple variables or diagnoses, is each equally valid?113 Furthermore, many AI applications look for a specific process, such as cancer diagnoses on CXRs. However, how coexisting conditions like cardiomegaly, emphysema, pneumonia, etc, seen on CXRs will affect the diagnosis needs to be considered.51,52 Zech and colleagues provide the example that diagnoses for pneumothorax are frequently rendered on CXRs with chest tubes in place.51 They suggest that CNNs may develop a bias toward diagnosing pneumothorax when chest tubes are present. Many current studies approach an issue in isolation, a situation not realistic in real-world clinical practice.26

Most studies on AI have been retrospective, and frequently data used to train the program are preselected.13,26 The data are typically validated on available databases rather than actual patients in the clinical setting, limiting confidence in the validity of the AI output when applied to real-world situations. Currently, fewer than 12 prospective trials have been published comparing AI with traditional clinical care.13,114 Randomized prospective clinical trials are even fewer, with none currently reported from the United States.13,114 The results from several studies have been shown to diminish when repeated prospectively.114

The FDA has created a new category known as Software as a Medical Device and has a Digital Health Innovation Action Plan to regulate AI platforms. Still, the process of AI regulation is of necessity different from traditional approval processes and is continually evolving.8 The FDA approval process cannot account for the fact that the program’s parameters may continually evolve or adapt.2

Guidelines for investigating and reporting AI research with its unique attributes are being developed. Examples include the TRIPOD-ML statement and others.49,115 In September 2020, 2 publications addressed the paucity of gold-standard randomized clinical trials in clinical AI applications.116,117 The SPIRIT-AI statement expands on the original SPIRIT statement published in 2013 to guide minimal reporting standards for AI clinical trial protocols to promote transparency of design and methodology.116 Similarly, the CONSORT-AI extension, stemming from the original CONSORT statement in 1996, aims to ensure quality reporting of randomized controlled trials in AI.117

Another risk with AI is that while an individual physician making a mistake may adversely affect 1 patient, a single mistake in an AI algorithm could potentially affect thousands of patients.48 Also, AI programs developed for the patient population at one facility may not translate to another. Referred to as overfitting, this phenomenon relates to selection bias in training data sets.15,34,49,51,52 Studies have shown that programs that underrepresent certain group characteristics such as age, sex, or race may be less effective when applied to a population in which these characteristics have differing representations.8,48,49 This problem of underrepresentation has been demonstrated in programs interpreting pathology slides, radiographs, and skin lesions.15,32,51
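
One practical way to look for the underrepresentation problem described above is to report performance separately for each patient subgroup rather than as a single pooled figure. The sketch below, in Python, computes sensitivity by group; the function name, group labels, and counts are hypothetical and serve only to illustrate the check, not to reproduce any cited result.

```python
from collections import defaultdict


def sensitivity_by_group(records):
    """Compute per-group sensitivity.

    records: iterable of (group, has_disease, flagged_by_model) tuples.
    Returns {group: sensitivity} for groups with at least one diseased case.
    """
    diseased = defaultdict(int)   # diseased cases seen per group
    detected = defaultdict(int)   # diseased cases the model flagged per group
    for group, has_disease, flagged in records:
        if has_disease:
            diseased[group] += 1
            if flagged:
                detected[group] += 1
    return {group: detected[group] / count for group, count in diseased.items()}


# Hypothetical validation results for two subgroups (illustration only).
records = (
    [("group_a", True, True)] * 9 + [("group_a", True, False)] * 1
    + [("group_b", True, True)] * 3 + [("group_b", True, False)] * 2
    + [("group_a", False, False)] * 5  # non-diseased cases do not affect sensitivity
)

result = sensitivity_by_group(records)
for group, value in sorted(result.items()):
    print(f"{group}: sensitivity {value:.2f}")
```

In this invented data set the pooled sensitivity would hide the gap between the two groups, which is why stratified reporting is a reasonable first step before deploying a model in a population that differs from its training data.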

Admittedly, most of these challenges are not specific to AI and existed in health care previously. Physicians make mistakes, treatments are sometimes used without adequate prospective studies, and medications are given without understanding their mechanism of action, much like AI-facilitated processes reach a conclusion that cannot be fully explained.48

Conclusions

The view that AI will dramatically impact health care in the coming years will likely prove true. However, much work is needed, especially because of the paucity of prospective clinical trials of the kind historically required in medical research. Any concern that AI will replace HCPs seems unwarranted. Early studies suggest that even AI programs that appear to exceed human interpretation perform best when working in cooperation with, and under oversight from, clinicians. AI’s greatest potential appears to be its ability to augment care from health professionals, improving efficiency and accuracy, and should be anticipated with enthusiasm as the field moves forward at an exponential rate.

Acknowledgments

The authors thank Makenna G. Thomas for proofreading and review of the manuscript. This material is the result of work supported with resources and the use of facilities at the James A. Haley Veterans’ Hospital. This research has been approved by the James A. Haley Veterans’ Hospital Office of Communications and Media.

References

1. Bini SA. Artificial intelligence, machine learning, deep learning, and cognitive computing: what do these terms mean and how will they impact health care? J Arthroplasty. 2018;33(8):2358-2361. doi:10.1016/j.arth.2018.02.067

2. Benjamens S, Dhunnoo P, Meskó B. The state of artificial intelligence-based FDA-approved medical devices and algorithms: an online database. NPJ Digit Med. 2020;3:118. doi:10.1038/s41746-020-00324-0

3. Viz. AI powered synchronized stroke care. Accessed September 15, 2021. https://www.viz.ai/ischemic-stroke

4. Buchanan M. The law of accelerating returns. Nat Phys. 2008;4(7):507. doi:10.1038/nphys1010

5. IBM Watson Health computes a pair of new solutions to improve healthcare data and security. Published September 10, 2015. Accessed October 21, 2020. https://www.techrepublic.com/article/ibm-watson-health-computes-a-pair-of-new-solutions-to-improve-healthcare-data-and-security

6. Borkowski AA, Kardani A, Mastorides SM, Thomas LB. Warfarin pharmacogenomics: recommendations with available patented clinical technologies. Recent Pat Biotechnol. 2014;8(2):110-115. doi:10.2174/1872208309666140904112003

7. Washington University in St. Louis. Warfarin dosing. Accessed September 15, 2021. http://www.warfarindosing.org/Source/Home.aspx

8. He J, Baxter SL, Xu J, Xu J, Zhou X, Zhang K. The practical implementation of artificial intelligence technologies in medicine. Nat Med. 2019;25(1):30-36. doi:10.1038/s41591-018-0307-0

9. Jiang F, Jiang Y, Zhi H, et al. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. 2017;2(4):230-243. Published 2017 Jun 21. doi:10.1136/svn-2017-000101

10. Johnson KW, Torres Soto J, Glicksberg BS, et al. Artificial intelligence in cardiology. J Am Coll Cardiol. 2018;71(23):2668-2679. doi:10.1016/j.jacc.2018.03.521

11. Borkowski AA, Wilson CP, Borkowski SA, et al. Comparing artificial intelligence platforms for histopathologic cancer diagnosis. Fed Pract. 2019;36(10):456-463.

12. Cruz-Roa A, Gilmore H, Basavanhally A, et al. High-throughput adaptive sampling for whole-slide histopathology image analysis (HASHI) via convolutional neural networks: application to invasive breast cancer detection. PLoS One. 2018;13(5):e0196828. Published 2018 May 24. doi:10.1371/journal.pone.0196828

13. Nagendran M, Chen Y, Lovejoy CA, et al. Artificial intelligence versus clinicians: systematic review of design, reporting standards, and claims of deep learning studies. BMJ. 2020;368:m689. Published 2020 Mar 25. doi:10.1136/bmj.m689

14. Shimizu H, Nakayama KI. Artificial intelligence in oncology. Cancer Sci. 2020;111(5):1452-1460. doi:10.1111/cas.14377

15. Talebi-Liasi F, Markowitz O. Is artificial intelligence going to replace dermatologists? Cutis. 2020;105(1):28-31.

16. Valliani AA, Ranti D, Oermann EK. Deep learning and neurology: a systematic review. Neurol Ther. 2019;8(2):351-365. doi:10.1007/s40120-019-00153-8

17. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444. doi:10.1038/nature14539

18. Graham S, Depp C, Lee EE, et al. Artificial intelligence for mental health and mental illnesses: an overview. Curr Psychiatry Rep. 2019;21(11):116. Published 2019 Nov 7. doi:10.1007/s11920-019-1094-0

19. Golas SB, Shibahara T, Agboola S, et al. A machine learning model to predict the risk of 30-day readmissions in patients with heart failure: a retrospective analysis of electronic medical records data. BMC Med Inform Decis Mak. 2018;18(1):44. Published 2018 Jun 22. doi:10.1186/s12911-018-0620-z

20. Mortazavi BJ, Downing NS, Bucholz EM, et al. Analysis of machine learning techniques for heart failure readmissions. Circ Cardiovasc Qual Outcomes. 2016;9(6):629-640. doi:10.1161/CIRCOUTCOMES.116.003039

21. Meyer-Bäse A, Morra L, Meyer-Bäse U, Pinker K. Current status and future perspectives of artificial intelligence in magnetic resonance breast imaging. Contrast Media Mol Imaging. 2020;2020:6805710. Published 2020 Aug 28. doi:10.1155/2020/6805710

22. Ehteshami Bejnordi B, Veta M, Johannes van Diest P, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA. 2017;318(22):2199-2210. doi:10.1001/jama.2017.14585

23. Borkowski AA, Viswanadhan NA, Thomas LB, Guzman RD, Deland LA, Mastorides SM. Using artificial intelligence for COVID-19 chest X-ray diagnosis. Fed Pract. 2020;37(9):398-404. doi:10.12788/fp.0045

24. Coudray N, Ocampo PS, Sakellaropoulos T, et al. Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat Med. 2018;24(10):1559-1567. doi:10.1038/s41591-018-0177-5

25. Kermany DS, Goldbaum M, Cai W, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell. 2018;172(5):1122-1131.e9. doi:10.1016/j.cell.2018.02.010

26. Liu X, Faes L, Kale AU, et al. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit Health. 2019;1(6):e271-e297. doi:10.1016/S2589-7500(19)30123-2

27. Nagpal K, Foote D, Liu Y, et al. Development and validation of a deep learning algorithm for improving Gleason scoring of prostate cancer [published correction appears in NPJ Digit Med. 2019 Nov 19;2:113]. NPJ Digit Med. 2019;2:48. Published 2019 Jun 7. doi:10.1038/s41746-019-0112-2

28. Nam JG, Park S, Hwang EJ, et al. Development and validation of deep learning-based automatic detection algorithm for malignant pulmonary nodules on chest radiographs. Radiology. 2019;290(1):218-228. doi:10.1148/radiol.2018180237

29. Schwalbe N, Wahl B. Artificial intelligence and the future of global health. Lancet. 2020;395(10236):1579-1586. doi:10.1016/S0140-6736(20)30226-9

30. Bai HX, Wang R, Xiong Z, et al. Artificial intelligence augmentation of radiologist performance in distinguishing COVID-19 from pneumonia of other origin at chest CT [published correction appears in Radiology. 2021 Apr;299(1):E225]. Radiology. 2020;296(3):E156-E165. doi:10.1148/radiol.2020201491

31. Li L, Qin L, Xu Z, et al. Using artificial intelligence to detect COVID-19 and community-acquired pneumonia based on pulmonary CT: evaluation of the diagnostic accuracy. Radiology. 2020;296(2):E65-E71. doi:10.1148/radiol.2020200905

32. Serag A, Ion-Margineanu A, Qureshi H, et al. Translational AI and deep learning in diagnostic pathology. Front Med (Lausanne). 2019;6:185. Published 2019 Oct 1. doi:10.3389/fmed.2019.00185

33. Wang D, Khosla A, Gargeya R, Irshad H, Beck AH. Deep learning for identifying metastatic breast cancer. ArXiv. 2016 June 18:arXiv:1606.05718v1. Published online June 18, 2016. Accessed September 15, 2021. http://arxiv.org/abs/1606.05718

34. Alabdulkareem A. Artificial intelligence and dermatologists: friends or foes? J Dermatology Dermatol Surg. 2019;23(2):57-60. doi:10.4103/jdds.jdds_19_19

35. Mollalo A, Mao L, Rashidi P, Glass GE. A GIS-based artificial neural network model for spatial distribution of tuberculosis across the continental United States. Int J Environ Res Public Health. 2019;16(1):157. Published 2019 Jan 8. doi:10.3390/ijerph16010157

36. Haddawy P, Hasan AHMI, Kasantikul R, et al. Spatiotemporal Bayesian networks for malaria prediction. Artif Intell Med. 2018;84:127-138. doi:10.1016/j.artmed.2017.12.002

37. Laureano-Rosario AE, Duncan AP, Mendez-Lazaro PA, et al. Application of artificial neural networks for dengue fever outbreak predictions in the northwest coast of Yucatan, Mexico and San Juan, Puerto Rico. Trop Med Infect Dis. 2018;3(1):5. Published 2018 Jan 5. doi:10.3390/tropicalmed3010005

38. Buczak AL, Koshute PT, Babin SM, Feighner BH, Lewis SH. A data-driven epidemiological prediction method for dengue outbreaks using local and remote sensing data. BMC Med Inform Decis Mak. 2012;12:124. Published 2012 Nov 5. doi:10.1186/1472-6947-12-124

39. Scavuzzo JM, Trucco F, Espinosa M, et al. Modeling dengue vector population using remotely sensed data and machine learning. Acta Trop. 2018;185:167-175. doi:10.1016/j.actatropica.2018.05.003

40. Xue H, Bai Y, Hu H, Liang H. Influenza activity surveillance based on multiple regression model and artificial neural network. IEEE Access. 2018;6:563-575. doi:10.1109/ACCESS.2017.2771798

41. Jiang D, Hao M, Ding F, Fu J, Li M. Mapping the transmission risk of Zika virus using machine learning models. Acta Trop. 2018;185:391-399. doi:10.1016/j.actatropica.2018.06.021

42. Bragazzi NL, Dai H, Damiani G, Behzadifar M, Martini M, Wu J. How big data and artificial intelligence can help better manage the COVID-19 pandemic. Int J Environ Res Public Health. 2020;17(9):3176. Published 2020 May 2. doi:10.3390/ijerph17093176

43. Lake IR, Colón-González FJ, Barker GC, Morbey RA, Smith GE, Elliot AJ. Machine learning to refine decision making within a syndromic surveillance service. BMC Public Health. 2019;19(1):559. Published 2019 May 14. doi:10.1186/s12889-019-6916-9

44. Khan OF, Bebb G, Alimohamed NA. Artificial intelligence in medicine: what oncologists need to know about its potential-and its limitations. Oncol Exch. 2017;16(4):8-13. Accessed September 1, 2021. http://www.oncologyex.com/pdf/vol16_no4/feature_khan-ai.pdf

45. Badano LP, Keller DM, Muraru D, Torlasco C, Parati G. Artificial intelligence and cardiovascular imaging: a win-win combination. Anatol J Cardiol. 2020;24(4):214-223. doi:10.14744/AnatolJCardiol.2020.94491

46. Murdoch TB, Detsky AS. The inevitable application of big data to health care. JAMA. 2013;309(13):1351-1352. doi:10.1001/jama.2013.393

47. Greatbatch O, Garrett A, Snape K. The impact of artificial intelligence on the current and future practice of clinical cancer genomics. Genet Res (Camb). 2019;101:e9. Published 2019 Oct 31. doi:10.1017/S0016672319000089

48. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25(1):44-56. doi:10.1038/s41591-018-0300-7

49. Vollmer S, Mateen BA, Bohner G, et al. Machine learning and artificial intelligence research for patient benefit: 20 critical questions on transparency, replicability, ethics, and effectiveness [published correction appears in BMJ. 2020 Apr 1;369:m1312]. BMJ. 2020;368:l6927. Published 2020 Mar 20. doi:10.1136/bmj.l6927

50. Lindsey R, Daluiski A, Chopra S, et al. Deep neural network improves fracture detection by clinicians. Proc Natl Acad Sci U S A. 2018;115(45):11591-11596. doi:10.1073/pnas.1806905115

51. Zech JR, Badgeley MA, Liu M, Costa AB, Titano JJ, Oermann EK. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: a cross-sectional study. PLoS Med. 2018;15(11):e1002683. doi:10.1371/journal.pmed.1002683

52. Lakhani P, Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology. 2017;284(2):574-582. doi:10.1148/radiol.2017162326

53. Rajpurkar P, Joshi A, Pareek A, et al. CheXpedition: investigating generalization challenges for translation of chest x-ray algorithms to the clinical setting. ArXiv. 2020 Feb 26:arXiv:2002.11379v2. Revised March 11, 2020. Accessed September 15, 2021. http://arxiv.org/abs/2002.11379

54. Salim M, Wåhlin E, Dembrower K, et al. External evaluation of 3 commercial artificial intelligence algorithms for independent assessment of screening mammograms. JAMA Oncol. 2020;6(10):1581-1588. doi:10.1001/jamaoncol.2020.3321

55. Arbabshirani MR, Fornwalt BK, Mongelluzzo GJ, et al. Advanced machine learning in action: identification of intracranial hemorrhage on computed tomography scans of the head with clinical workflow integration. NPJ Digit Med. 2018;1:9. doi:10.1038/s41746-017-0015-z

56. Sheth D, Giger ML. Artificial intelligence in the interpretation of breast cancer on MRI. J Magn Reson Imaging. 2020;51(5):1310-1324. doi:10.1002/jmri.26878

57. McKinney SM, Sieniek M, Godbole V, et al. International evaluation of an AI system for breast cancer screening. Nature. 2020;577(7788):89-94. doi:10.1038/s41586-019-1799-6

58. Booth AL, Abels E, McCaffrey P. Development of a prognostic model for mortality in COVID-19 infection using machine learning. Mod Pathol. 2021;34(3):522-531. doi:10.1038/s41379-020-00700-x

59. Xu B, Kocyigit D, Grimm R, Griffin BP, Cheng F. Applications of artificial intelligence in multimodality cardiovascular imaging: a state-of-the-art review. Prog Cardiovasc Dis. 2020;63(3):367-376. doi:10.1016/j.pcad.2020.03.003

60. Dey D, Slomka PJ, Leeson P, et al. Artificial intelligence in cardiovascular imaging: JACC state-of-the-art review. J Am Coll Cardiol. 2019;73(11):1317-1335. doi:10.1016/j.jacc.2018.12.054

61. Carewell Health. AI powered ECG diagnosis solutions. Accessed November 2, 2020. https://www.carewellhealth.com/products_aiecg.html

62. Strodthoff N, Strodthoff C. Detecting and interpreting myocardial infarction using fully convolutional neural networks. Physiol Meas. 2019;40(1):015001. doi:10.1088/1361-6579/aaf34d

63. Hannun AY, Rajpurkar P, Haghpanahi M, et al. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat Med. 2019;25(1):65-69. doi:10.1038/s41591-018-0268-3

64. Kwon JM, Jeon KH, Kim HM, et al. Comparing the performance of artificial intelligence and conventional diagnosis criteria for detecting left ventricular hypertrophy using electrocardiography. Europace. 2020;22(3):412-419. doi:10.1093/europace/euz324

65. Eko. FDA clears Eko’s AFib and heart murmur detection algorithms, making it the first AI-powered stethoscope to screen for serious heart conditions [press release]. Published January 28, 2020. Accessed September 15, 2021. https://www.businesswire.com/news/home/20200128005232/en/FDA-Clears-Eko’s-AFib-and-Heart-Murmur-Detection-Algorithms-Making-It-the-First-AI-Powered-Stethoscope-to-Screen-for-Serious-Heart-Conditions

66. Cruz-Roa A, Gilmore H, Basavanhally A, et al. Accurate and reproducible invasive breast cancer detection in whole-slide images: a deep learning approach for quantifying tumor extent. Sci Rep. 2017;7:46450. doi:10.1038/srep46450

67. Acs B, Rantalainen M, Hartman J. Artificial intelligence as the next step towards precision pathology. J Intern Med. 2020;288(1):62-81. doi:10.1111/joim.13030

68. Mobadersany P, Yousefi S, Amgad M, et al. Predicting cancer outcomes from histology and genomics using convolutional networks. Proc Natl Acad Sci U S A. 2018;115(13):E2970-E2979. doi:10.1073/pnas.1717139115

69. Wang X, Janowczyk A, Zhou Y, et al. Prediction of recurrence in early stage non-small cell lung cancer using computer extracted nuclear features from digital H&E images. Sci Rep. 2017;7:13543. doi:10.1038/s41598-017-13773-7

70. Kulkarni PM, Robinson EJ, Pradhan JS, et al. Deep learning based on standard H&E images of primary melanoma tumors identifies patients at risk for visceral recurrence and death. Clin Cancer Res. 2020;26(5):1126-1134. doi:10.1158/1078-0432.CCR-19-1495

71. Du XL, Li WB, Hu BJ. Application of artificial intelligence in ophthalmology. Int J Ophthalmol. 2018;11(9):1555-1561. doi:10.18240/ijo.2018.09.21

72. Gunasekeran DV, Wong TY. Artificial intelligence in ophthalmology in 2020: a technology on the cusp for translation and implementation. Asia Pac J Ophthalmol (Phila). 2020;9(2):61-66. doi:10.1097/01.APO.0000656984.56467.2c

73. Ting DSW, Pasquale LR, Peng L, et al. Artificial intelligence and deep learning in ophthalmology. Br J Ophthalmol. 2019;103(2):167-175. doi:10.1136/bjophthalmol-2018-313173

74. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. 2016;316(22):2402-2410. doi:10.1001/jama.2016.17216

75. US Food and Drug Administration. FDA permits marketing of artificial intelligence-based device to detect certain diabetes-related eye problems [press release]. Published April 11, 2018. Accessed September 15, 2021. https://www.fda.gov/news-events/press-announcements/fda-permits-marketing-artificial-intelligence-based-device-detect-certain-diabetes-related-eye

76. Long E, Chen J, Wu X, et al. Artificial intelligence manages congenital cataract with individualized prediction and telehealth computing. NPJ Digit Med. 2020;3:112. doi:10.1038/s41746-020-00319-x

77. De Fauw J, Ledsam JR, Romera-Paredes B, et al. Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat Med. 2018;24(9):1342-1350. doi:10.1038/s41591-018-0107-6

78. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115-118. doi:10.1038/nature21056

79. Brinker TJ, Hekler A, Enk AH, et al. Deep neural networks are superior to dermatologists in melanoma image classification. Eur J Cancer. 2019;119:11-17. doi:10.1016/j.ejca.2019.05.023

80. Brinker TJ, Hekler A, Enk AH, et al. A convolutional neural network trained with dermoscopic images performed on par with 145 dermatologists in a clinical melanoma image classification task. Eur J Cancer. 2019;111:148-154. doi:10.1016/j.ejca.2019.02.005

81. Haenssle HA, Fink C, Schneiderbauer R, et al. Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Ann Oncol. 2018;29(8):1836-1842. doi:10.1093/annonc/mdy166

82. Li CX, Shen CB, Xue K, et al. Artificial intelligence in dermatology: past, present, and future. Chin Med J (Engl). 2019;132(17):2017-2020. doi:10.1097/CM9.0000000000000372

83. Tschandl P, Codella N, Akay BN, et al. Comparison of the accuracy of human readers versus machine-learning algorithms for pigmented skin lesion classification: an open, web-based, international, diagnostic study. Lancet Oncol. 2019;20(7):938-947. doi:10.1016/S1470-2045(19)30333-X

84. Han SS, Park I, Eun Chang SE, et al. Augmented intelligence dermatology: deep neural networks empower medical professionals in diagnosing skin cancer and predicting treatment options for 134 skin disorders. J Invest Dermatol. 2020;140(9):1753-1761. doi:10.1016/j.jid.2020.01.019

85. Freeman K, Dinnes J, Chuchu N, et al. Algorithm based smartphone apps to assess risk of skin cancer in adults: systematic review of diagnostic accuracy studies [published correction appears in BMJ. 2020 Feb 25;368:m645]. BMJ. 2020;368:m127. Published 2020 Feb 10. doi:10.1136/bmj.m127

86. Chen YC, Ke WC, Chiu HW. Risk classification of cancer survival using ANN with gene expression data from multiple laboratories. Comput Biol Med. 2014;48:1-7. doi:10.1016/j.compbiomed.2014.02.006

87. Kim W, Kim KS, Lee JE, et al. Development of novel breast cancer recurrence prediction model using support vector machine. J Breast Cancer. 2012;15(2):230-238. doi:10.4048/jbc.2012.15.2.230

88. Merath K, Hyer JM, Mehta R, et al. Use of machine learning for prediction of patient risk of postoperative complications after liver, pancreatic, and colorectal surgery. J Gastrointest Surg. 2020;24(8):1843-1851. doi:10.1007/s11605-019-04338-2

89. Santos-García G, Varela G, Novoa N, Jiménez MF. Prediction of postoperative morbidity after lung resection using an artificial neural network ensemble. Artif Intell Med. 2004;30(1):61-69. doi:10.1016/S0933-3657(03)00059-9

90. Ibragimov B, Xing L. Segmentation of organs-at-risks in head and neck CT images using convolutional neural networks. Med Phys. 2017;44(2):547-557. doi:10.1002/mp.12045

91. Lou B, Doken S, Zhuang T, et al. An image-based deep learning framework for individualizing radiotherapy dose. Lancet Digit Health. 2019;1(3):e136-e147. doi:10.1016/S2589-7500(19)30058-5

92. Xu J, Yang P, Xue S, et al. Translating cancer genomics into precision medicine with artificial intelligence: applications, challenges and future perspectives. Hum Genet. 2019;138(2):109-124. doi:10.1007/s00439-019-01970-5

93. Patel NM, Michelini VV, Snell JM, et al. Enhancing next‐generation sequencing‐guided cancer care through cognitive computing. Oncologist. 2018;23(2):179-185. doi:10.1634/theoncologist.2017-0170

94. Le Berre C, Sandborn WJ, Aridhi S, et al. Application of artificial intelligence to gastroenterology and hepatology. Gastroenterology. 2020;158(1):76-94.e2. doi:10.1053/j.gastro.2019.08.058

95. Yang YJ, Bang CS. Application of artificial intelligence in gastroenterology. World J Gastroenterol. 2019;25(14):1666-1683. doi:10.3748/wjg.v25.i14.1666

96. Wang P, Berzin TM, Glissen Brown JR, et al. Real-time automatic detection system increases colonoscopic polyp and adenoma detection rates: a prospective randomised controlled study. Gut. 2019;68(10):1813-1819. doi:10.1136/gutjnl-2018-317500

97. Gupta R, Krishnam SP, Schaefer PW, Lev MH, Gonzalez RG. An East Coast perspective on artificial intelligence and machine learning: part 2: ischemic stroke imaging and triage. Neuroimaging Clin N Am. 2020;30(4):467-478. doi:10.1016/j.nic.2020.08.002

98. Belić M, Bobić V, Badža M, Šolaja N, Đurić-Jovičić M, Kostić VS. Artificial intelligence for assisting diagnostics and assessment of Parkinson’s disease-a review. Clin Neurol Neurosurg. 2019;184:105442. doi:10.1016/j.clineuro.2019.105442

99. An S, Kang C, Lee HW. Artificial intelligence and computational approaches for epilepsy. J Epilepsy Res. 2020;10(1):8-17. doi:10.14581/jer.20003

100. Pavel AM, Rennie JM, de Vries LS, et al. A machine-learning algorithm for neonatal seizure recognition: a multicentre, randomised, controlled trial. Lancet Child Adolesc Health. 2020;4(10):740-749. doi:10.1016/S2352-4642(20)30239-X

101. Afzal HMR, Luo S, Ramadan S, Lechner-Scott J. The emerging role of artificial intelligence in multiple sclerosis imaging [published online ahead of print, 2020 Oct 28]. Mult Scler. 2020;1352458520966298. doi:10.1177/1352458520966298

102. Bouton CE. Restoring movement in paralysis with a bioelectronic neural bypass approach: current state and future directions. Cold Spring Harb Perspect Med. 2019;9(11):a034306. doi:10.1101/cshperspect.a034306

103. Durstewitz D, Koppe G, Meyer-Lindenberg A. Deep neural networks in psychiatry. Mol Psychiatry. 2019;24(11):1583-1598. doi:10.1038/s41380-019-0365-9

104. Fonseka TM, Bhat V, Kennedy SH. The utility of artificial intelligence in suicide risk prediction and the management of suicidal behaviors. Aust N Z J Psychiatry. 2019;53(10):954-964. doi:10.1177/0004867419864428

105. Kessler RC, Hwang I, Hoffmire CA, et al. Developing a practical suicide risk prediction model for targeting high-risk patients in the Veterans Health Administration. Int J Methods Psychiatr Res. 2017;26(3):e1575. doi:10.1002/mpr.1575

106. Kessler RC, Bauer MS, Bishop TM, et al. Using administrative data to predict suicide after psychiatric hospitalization in the Veterans Health Administration System. Front Psychiatry. 2020;11:390. doi:10.3389/fpsyt.2020.00390

107. Kessler RC, van Loo HM, Wardenaar KJ, et al. Testing a machine-learning algorithm to predict the persistence and severity of major depressive disorder from baseline self-reports. Mol Psychiatry. 2016;21(10):1366-1371. doi:10.1038/mp.2015.198

108. Horng S, Sontag DA, Halpern Y, Jernite Y, Shapiro NI, Nathanson LA. Creating an automated trigger for sepsis clinical decision support at emergency department triage using machine learning. PLoS One. 2017;12(4):e0174708. doi:10.1371/journal.pone.0174708

109. Soffer S, Klang E, Barash Y, Grossman E, Zimlichman E. Predicting in-hospital mortality at admission to the medical ward: a big-data machine learning model. Am J Med. 2021;134(2):227-234.e4. doi:10.1016/j.amjmed.2020.07.014

110. Labovitz DL, Shafner L, Reyes Gil M, Virmani D, Hanina A. Using artificial intelligence to reduce the risk of nonadherence in patients on anticoagulation therapy. Stroke. 2017;48(5):1416-1419. doi:10.1161/STROKEAHA.116.016281

111. Forlenza GP. Use of artificial intelligence to improve diabetes outcomes in patients using multiple daily injections therapy. Diabetes Technol Ther. 2019;21(S2):S24-S28. doi:10.1089/dia.2019.0077

112. Poser CM. CT scan and the practice of neurology. Arch Neurol. 1977;34(2):132. doi:10.1001/archneur.1977.00500140086023

113. Angus DC. Randomized clinical trials of artificial intelligence. JAMA. 2020;323(11):1043-1045. doi:10.1001/jama.2020.1039

114. Topol EJ. Welcoming new guidelines for AI clinical research. Nat Med. 2020;26(9):1318-1320. doi:10.1038/s41591-020-1042-x

115. Collins GS, Moons KGM. Reporting of artificial intelligence prediction models. Lancet. 2019;393(10181):1577-1579. doi:10.1016/S0140-6736(19)30037-6

116. Cruz Rivera S, Liu X, Chan AW, et al. Guidelines for clinical trial protocols for interventions involving artificial intelligence: the SPIRIT-AI extension. Nat Med. 2020;26(9):1351-1363. doi:10.1038/s41591-020-1037-7

117. Liu X, Cruz Rivera S, Moher D, Calvert MJ, Denniston AK; SPIRIT-AI and CONSORT-AI Working Group. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Nat Med. 2020;26(9):1364-1374. doi:10.1038/s41591-020-1034-x

118. McCulloch WS, Pitts W. A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys. 1943;5(4):115-133. doi:10.1007/BF02478259

119. Samuel AL. Some studies in machine learning using the game of checkers. IBM J Res Dev. 1959;3(3):535-554. Accessed September 15, 2021. https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.368.2254

120. Sonoda M, Takano M, Miyahara J, Kato H. Computed radiography utilizing scanning laser stimulated luminescence. Radiology. 1983;148(3):833-838. doi:10.1148/radiology.148.3.6878707

121. Dechter R. Learning while searching in constraint-satisfaction-problems. AAAI’86: proceedings of the fifth AAAI national conference on artificial intelligence. Published 1986. Accessed September 15, 2021. https://www.aaai.org/Papers/AAAI/1986/AAAI86-029.pdf

122. Le Cun Y, Jackel LD, Boser B, et al. Handwritten digit recognition: applications of neural network chips and automatic learning. IEEE Commun Mag. 1989;27(11):41-46. doi:10.1109/35.41400

123. US Food and Drug Administration. FDA allows marketing of first whole slide imaging system for digital pathology [press release]. Published April 12, 2017. Accessed September 15, 2021. https://www.fda.gov/news-events/press-announcements/fda-allows-marketing-first-whole-slide-imaging-system-digital-pathology


78. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115-118. doi:10.1038/nature21056

79. Brinker TJ, Hekler A, Enk AH, et al. Deep neural networks are superior to dermatologists in melanoma image classification. Eur J Cancer. 2019;119:11-17. doi:10.1016/j.ejca.2019.05.023

80. Brinker TJ, Hekler A, Enk AH, et al. A convolutional neural network trained with dermoscopic images performed on par with 145 dermatologists in a clinical melanoma image classification task. Eur J Cancer. 2019;111:148-154. doi:10.1016/j.ejca.2019.02.005

81. Haenssle HA, Fink C, Schneiderbauer R, et al. Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Ann Oncol. 2018;29(8):1836-1842. doi:10.1093/annonc/mdy166

82. Li CX, Shen CB, Xue K, et al. Artificial intelligence in dermatology: past, present, and future. Chin Med J (Engl). 2019;132(17):2017-2020. doi:10.1097/CM9.0000000000000372

83. Tschandl P, Codella N, Akay BN, et al. Comparison of the accuracy of human readers versus machine-learning algorithms for pigmented skin lesion classification: an open, web-based, international, diagnostic study. Lancet Oncol. 2019;20(7):938-947. doi:10.1016/S1470-2045(19)30333-X

84. Han SS, Park I, Eun Chang SE, et al. Augmented intelligence dermatology: deep neural networks empower medical professionals in diagnosing skin cancer and predicting treatment options for 134 skin disorders. J Invest Dermatol. 2020;140(9):1753-1761. doi:10.1016/j.jid.2020.01.019

85. Freeman K, Dinnes J, Chuchu N, et al. Algorithm based smartphone apps to assess risk of skin cancer in adults: systematic review of diagnostic accuracy studies [published correction appears in BMJ. 2020 Feb 25;368:m645]. BMJ. 2020;368:m127. Published 2020 Feb 10. doi:10.1136/bmj.m127

86. Chen YC, Ke WC, Chiu HW. Risk classification of cancer survival using ANN with gene expression data from multiple laboratories. Comput Biol Med. 2014;48:1-7. doi:10.1016/j.compbiomed.2014.02.006

87. Kim W, Kim KS, Lee JE, et al. Development of novel breast cancer recurrence prediction model using support vector machine. J Breast Cancer. 2012;15(2):230-238. doi:10.4048/jbc.2012.15.2.230

88. Merath K, Hyer JM, Mehta R, et al. Use of machine learning for prediction of patient risk of postoperative complications after liver, pancreatic, and colorectal surgery. J Gastrointest Surg. 2020;24(8):1843-1851. doi:10.1007/s11605-019-04338-2

89. Santos-García G, Varela G, Novoa N, Jiménez MF. Prediction of postoperative morbidity after lung resection using an artificial neural network ensemble. Artif Intell Med. 2004;30(1):61-69. doi:10.1016/S0933-3657(03)00059-9

90. Ibragimov B, Xing L. Segmentation of organs-at-risks in head and neck CT images using convolutional neural networks. Med Phys. 2017;44(2):547-557. doi:10.1002/mp.12045

91. Lou B, Doken S, Zhuang T, et al. An image-based deep learning framework for individualizing radiotherapy dose. Lancet Digit Health. 2019;1(3):e136-e147. doi:10.1016/S2589-7500(19)30058-5

92. Xu J, Yang P, Xue S, et al. Translating cancer genomics into precision medicine with artificial intelligence: applications, challenges and future perspectives. Hum Genet. 2019;138(2):109-124. doi:10.1007/s00439-019-01970-5

93. Patel NM, Michelini VV, Snell JM, et al. Enhancing next‐generation sequencing‐guided cancer care through cognitive computing. Oncologist. 2018;23(2):179-185. doi:10.1634/theoncologist.2017-0170

94. Le Berre C, Sandborn WJ, Aridhi S, et al. Application of artificial intelligence to gastroenterology and hepatology. Gastroenterology. 2020;158(1):76-94.e2. doi:10.1053/j.gastro.2019.08.058

95. Yang YJ, Bang CS. Application of artificial intelligence in gastroenterology. World J Gastroenterol. 2019;25(14):1666-1683. doi:10.3748/wjg.v25.i14.1666

96. Wang P, Berzin TM, Glissen Brown JR, et al. Real-time automatic detection system increases colonoscopic polyp and adenoma detection rates: a prospective randomised controlled study. Gut. 2019;68(10):1813-1819. doi:10.1136/gutjnl-2018-317500

97. Gupta R, Krishnam SP, Schaefer PW, Lev MH, Gonzalez RG. An East Coast perspective on artificial intelligence and machine learning: part 2: ischemic stroke imaging and triage. Neuroimaging Clin N Am. 2020;30(4):467-478. doi:10.1016/j.nic.2020.08.002

98. Belić M, Bobić V, Badža M, Šolaja N, Đurić-Jovičić M, Kostić VS. Artificial intelligence for assisting diagnostics and assessment of Parkinson’s disease—a review. Clin Neurol Neurosurg. 2019;184:105442. doi:10.1016/j.clineuro.2019.105442

99. An S, Kang C, Lee HW. Artificial intelligence and computational approaches for epilepsy. J Epilepsy Res. 2020;10(1):8-17. doi:10.14581/jer.20003

100. Pavel AM, Rennie JM, de Vries LS, et al. A machine-learning algorithm for neonatal seizure recognition: a multicentre, randomised, controlled trial. Lancet Child Adolesc Health. 2020;4(10):740-749. doi:10.1016/S2352-4642(20)30239-X

101. Afzal HMR, Luo S, Ramadan S, Lechner-Scott J. The emerging role of artificial intelligence in multiple sclerosis imaging [published online ahead of print, 2020 Oct 28]. Mult Scler. 2020;1352458520966298. doi:10.1177/1352458520966298

102. Bouton CE. Restoring movement in paralysis with a bioelectronic neural bypass approach: current state and future directions. Cold Spring Harb Perspect Med. 2019;9(11):a034306. doi:10.1101/cshperspect.a034306

103. Durstewitz D, Koppe G, Meyer-Lindenberg A. Deep neural networks in psychiatry. Mol Psychiatry. 2019;24(11):1583-1598. doi:10.1038/s41380-019-0365-9

104. Fonseka TM, Bhat V, Kennedy SH. The utility of artificial intelligence in suicide risk prediction and the management of suicidal behaviors. Aust N Z J Psychiatry. 2019;53(10):954-964. doi:10.1177/0004867419864428

105. Kessler RC, Hwang I, Hoffmire CA, et al. Developing a practical suicide risk prediction model for targeting high-risk patients in the Veterans Health Administration. Int J Methods Psychiatr Res. 2017;26(3):e1575. doi:10.1002/mpr.1575

106. Kessler RC, Bauer MS, Bishop TM, et al. Using administrative data to predict suicide after psychiatric hospitalization in the Veterans Health Administration System. Front Psychiatry. 2020;11:390. doi:10.3389/fpsyt.2020.00390

107. Kessler RC, van Loo HM, Wardenaar KJ, et al. Testing a machine-learning algorithm to predict the persistence and severity of major depressive disorder from baseline self-reports. Mol Psychiatry. 2016;21(10):1366-1371. doi:10.1038/mp.2015.198

108. Horng S, Sontag DA, Halpern Y, Jernite Y, Shapiro NI, Nathanson LA. Creating an automated trigger for sepsis clinical decision support at emergency department triage using machine learning. PLoS One. 2017;12(4):e0174708. doi:10.1371/journal.pone.0174708

109. Soffer S, Klang E, Barash Y, Grossman E, Zimlichman E. Predicting in-hospital mortality at admission to the medical ward: a big-data machine learning model. Am J Med. 2021;134(2):227-234.e4. doi:10.1016/j.amjmed.2020.07.014

110. Labovitz DL, Shafner L, Reyes Gil M, Virmani D, Hanina A. Using artificial intelligence to reduce the risk of nonadherence in patients on anticoagulation therapy. Stroke. 2017;48(5):1416-1419. doi:10.1161/STROKEAHA.116.016281

111. Forlenza GP. Use of artificial intelligence to improve diabetes outcomes in patients using multiple daily injections therapy. Diabetes Technol Ther. 2019;21(S2):S24-S28. doi:10.1089/dia.2019.0077

112. Poser CM. CT scan and the practice of neurology. Arch Neurol. 1977;34(2):132. doi:10.1001/archneur.1977.00500140086023

113. Angus DC. Randomized clinical trials of artificial intelligence. JAMA. 2020;323(11):1043-1045. doi:10.1001/jama.2020.1039

114. Topol EJ. Welcoming new guidelines for AI clinical research. Nat Med. 2020;26(9):1318-1320. doi:10.1038/s41591-020-1042-x

115. Collins GS, Moons KGM. Reporting of artificial intelligence prediction models. Lancet. 2019;393(10181):1577-1579. doi:10.1016/S0140-6736(19)30037-6

116. Cruz Rivera S, Liu X, Chan AW, et al. Guidelines for clinical trial protocols for interventions involving artificial intelligence: the SPIRIT-AI extension. Nat Med. 2020;26(9):1351-1363. doi:10.1038/s41591-020-1037-7

117. Liu X, Cruz Rivera S, Moher D, Calvert MJ, Denniston AK; SPIRIT-AI and CONSORT-AI Working Group. Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. Nat Med. 2020;26(9):1364-1374. doi:10.1038/s41591-020-1034-x

118. McCulloch WS, Pitts W. A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys. 1943;5(4):115-133. doi:10.1007/BF02478259

119. Samuel AL. Some studies in machine learning using the game of Checkers. IBM J Res Dev. 1959;3(3):535-554. Accessed September 15, 2021. https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.368.2254

120. Sonoda M, Takano M, Miyahara J, Kato H. Computed radiography utilizing scanning laser stimulated luminescence. Radiology. 1983;148(3):833-838. doi:10.1148/radiology.148.3.6878707

121. Dechter R. Learning while searching in constraint-satisfaction-problems. AAAI’86: proceedings of the fifth AAAI national conference on artificial intelligence. Published 1986. Accessed September 15, 2021. https://www.aaai.org/Papers/AAAI/1986/AAAI86-029.pdf

122. Le Cun Y, Jackel LD, Boser B, et al. Handwritten digit recognition: applications of neural network chips and automatic learning. IEEE Commun Mag. 1989;27(11):41-46. doi:10.1109/35.41400

123. US Food and Drug Administration. FDA allows marketing of first whole slide imaging system for digital pathology [press release]. Published April 12, 2017. Accessed September 15, 2021. https://www.fda.gov/news-events/press-announcements/fda-allows-marketing-first-whole-slide-imaging-system-digital-pathology

Issue
Federal Practitioner - 38(11)a
Page Number
527-538

Using Artificial Intelligence for COVID-19 Chest X-ray Diagnosis

Article Type
Changed
Thu, 08/26/2021 - 16:00

The novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which causes the respiratory disease coronavirus disease-19 (COVID-19), was first identified as a cluster of cases of pneumonia in Wuhan, Hubei Province of China on December 31, 2019.1 Within a month, the disease had spread significantly, leading the World Health Organization (WHO) to designate COVID-19 a public health emergency of international concern. On March 11, 2020, the WHO declared COVID-19 a global pandemic.2 As of August 18, 2020, the virus had infected > 21 million people, with > 750,000 deaths worldwide.3 The spread of COVID-19 has had a dramatic impact on social, economic, and health care issues throughout the world, which has been discussed elsewhere.4

Prior to this century, members of the coronavirus family had minimal impact on human health.5 However, in the past 20 years, outbreaks have highlighted the emerging importance of coronaviruses in morbidity and mortality on a global scale. Although less prevalent than COVID-19, severe acute respiratory syndrome (SARS) in 2002 to 2003 and Middle East respiratory syndrome (MERS) in 2012 likely had higher mortality rates than the current pandemic.5 Based on this recent history, it is reasonable to assume that we will continue to see novel diseases with similar significant health and societal implications. The challenges presented to health care providers (HCPs) by such novel viral pathogens are numerous, including methods for rapid diagnosis, prevention, and treatment. In the current study, we focus on diagnostic issues, which became evident with COVID-19 in the time required to develop rapid and effective diagnostic modalities.

We have previously reported the utility of using artificial intelligence (AI) in the histopathologic diagnosis of cancer.6-8 AI was first described in 1956 and involves the field of computer science in which machines are trained to learn from experience.9 Machine learning (ML) is a subset of AI and is achieved by using mathematical models to compute sample datasets.10 Current ML employs deep learning with neural network algorithms, which can recognize patterns and achieve complex computational tasks, often far more quickly and precisely than humans can.11-13 In addition to applications in pathology, ML algorithms have both prognostic and diagnostic applications in multiple medical specialties, such as radiology, dermatology, ophthalmology, and cardiology.6 It is predicted that AI will impact almost every aspect of health care in the future.14

In this article, we examine the potential for AI to diagnose patients with COVID-19 pneumonia using chest radiographs (CXR) alone. This is done using Microsoft CustomVision (www.customvision.ai), a readily available, automated ML platform. Employing AI to both screen and diagnose emerging health emergencies such as COVID-19 has the potential to dramatically change how we approach medical care in the future. In addition, we describe the creation of a publicly available website (interknowlogy-covid-19.azurewebsites.net) that could augment COVID-19 pneumonia CXR diagnosis.

Methods

For the training dataset, 103 CXR images of COVID-19 were downloaded from the GitHub covid-chest-xray dataset.15 Five hundred images of non-COVID-19 pneumonia and 500 images of normal lungs were downloaded from the Kaggle RSNA Pneumonia Detection Challenge dataset.16 To balance the dataset, we expanded the COVID-19 dataset to 500 images by slight rotation (probability = 1, max rotation = 5) and zooming (probability = 0.5, percentage area = 0.9) of the original images using the Augmentor Python package.17
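The augmentation step can be illustrated with a minimal sketch using the Augmentor package. It is not the authors' script. The folder name `covid_cxr` is hypothetical. Applying the single reported "max rotation = 5" value to both the left and right limits is an assumption. The article also does not state whether the 103 original images were kept among the 500, so the helper covers both readings.

```python
# Settings taken from the Methods. Using 5 for both rotation limits is an
# assumption, since the article reports a single "max rotation" value.
ROTATE = {"probability": 1.0, "max_left_rotation": 5, "max_right_rotation": 5}
ZOOM = {"probability": 0.5, "percentage_area": 0.9}
TARGET_PER_CLASS = 500  # size of each comparison class in the article


def images_to_generate(n_original: int, target: int = TARGET_PER_CLASS,
                       keep_originals: bool = True) -> int:
    """Number of augmented images needed to reach the target class size."""
    if keep_originals:
        return max(target - n_original, 0)
    return target


def augment_covid_images(source_dir: str, n_samples: int) -> None:
    """Write n_samples augmented images derived from those in source_dir.

    Requires the third-party Augmentor package. By default Augmentor saves
    results to an "output" folder inside source_dir.
    """
    import Augmentor  # imported lazily so the helper above has no dependency

    pipeline = Augmentor.Pipeline(source_dir)
    pipeline.rotate(**ROTATE)       # applied to every image (probability 1)
    pipeline.zoom_random(**ZOOM)    # applied to about half of the images
    pipeline.sample(n_samples)
```

Under these assumptions, `augment_covid_images("covid_cxr", images_to_generate(103))` would add 397 augmented radiographs to the 103 originals.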

Validation Dataset

For the validation dataset, 30 random CXR images were obtained from the US Department of Veterans Affairs (VA) PACS (picture archiving and communication system). This dataset included 10 CXR images from hospitalized patients with COVID-19, 10 CXR pneumonia images from patients without COVID-19, and 10 normal CXRs. COVID-19 diagnoses were confirmed with a positive test result from the Xpert Xpress SARS-CoV-2 polymerase chain reaction (PCR) platform.18

Microsoft CustomVision

Microsoft CustomVision is an automated image classification and object detection system that is a part of Microsoft Azure Cognitive Services (azure.microsoft.com). It has a pay-as-you-go model with fees depending on the computing needs and usage. It offers a free trial to users for 2 initial projects. The service is online with an easy-to-follow graphical user interface. No coding skills are necessary.

fdp03709398_f1.png

We created a new classification project in CustomVision and chose a compact general domain for small size and easy export to the TensorFlow.js model format. TensorFlow.js is a JavaScript library that enables dynamic download and execution of ML models. After the project was created, we proceeded to upload our image dataset. Each class was uploaded separately and tagged with the appropriate label (covid pneumonia, non-covid pneumonia, or normal lung). The system rejected 16 COVID-19 images as duplicates. The final CustomVision training dataset consisted of 484 images of COVID-19 pneumonia, 500 images of non-COVID-19 pneumonia, and 500 images of normal lungs. Once the images were uploaded, CustomVision trained itself on the dataset when training was initiated (Figure 1).

Website Creation

CustomVision was used to train the model. It can be used to execute the model continuously, or the model can be compacted and decoupled from CustomVision. In this case, the model was compacted and decoupled for use in an online application. An Angular online application was created with TensorFlow.js. Within a user’s web browser, the model is executed when an image of a CXR is submitted. Confidence values for each classification are returned. In this design, after the initial webpage and model are downloaded, the webpage no longer needs to access any server components and performs all operations in the browser. Although the solution works well on mobile phone browsers and in low bandwidth situations, the quality of predictions may depend on the browser and device used. At no time does an image get submitted to the cloud.

fdp03709398_f2.png

Results

Overall, our trained model showed 92.9% precision and recall. Precision and recall results for each label were 98.9% and 94.8%, respectively, for COVID-19 pneumonia; 91.8% and 89%, respectively, for non-COVID-19 pneumonia; and 88.8% and 95%, respectively, for normal lung (Figure 2). Next, we proceeded to validate the trained model on the VA data by making individual predictions on 30 images from the VA dataset. Our model performed well with 100% sensitivity (recall), 95% specificity, 97% accuracy, 91% positive predictive value (precision), and 100% negative predictive value (Table).
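The validation metrics follow from a 2 × 2 tabulation of COVID-19 vs non-COVID-19 calls. The sketch below is illustrative only. The counts are an assumption: they are inferred from the reported percentages and the 10/10/10 composition of the validation set (treating non-COVID-19 pneumonia and normal lung together as the negative class), not copied from the Table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryCounts:
    tp: int  # COVID-19 images classified as COVID-19
    fp: int  # non-COVID-19 images classified as COVID-19
    tn: int  # non-COVID-19 images classified as non-COVID-19
    fn: int  # COVID-19 images classified as non-COVID-19


def diagnostic_metrics(c: BinaryCounts) -> dict:
    """Standard diagnostic accuracy measures from a 2 x 2 table."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # recall
        "specificity": c.tn / (c.tn + c.fp),
        "accuracy": (c.tp + c.tn) / total,
        "ppv": c.tp / (c.tp + c.fp),          # precision
        "npv": c.tn / (c.tn + c.fn),
    }


# Assumed counts, inferred from the reported percentages (not from the Table).
validation = BinaryCounts(tp=10, fp=1, tn=19, fn=0)
for name, value in diagnostic_metrics(validation).items():
    print(f"{name}: {value:.1%}")
```

With these assumed counts, a single false-positive call among the 20 non-COVID-19 images is enough to account for the reported 95% specificity and 91% positive predictive value, which shows how sensitive the estimates are to individual images in a 30-image validation set.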

fdp03709398_t.png

 

Discussion

We successfully demonstrated the potential of using AI algorithms in assessing CXRs for COVID-19. We first trained the CustomVision automated image classification and object detection system to differentiate COVID-19 pneumonia from pneumonia of other etiologies and from normal lung CXRs. We then tested our model against known patients from the James A. Haley Veterans’ Hospital in Tampa, Florida. The program achieved 100% sensitivity (recall), 95% specificity, 97% accuracy, 91% positive predictive value (precision), and 100% negative predictive value in differentiating the 3 scenarios. Using the trained ML model, we proceeded to create a website that could augment COVID-19 CXR diagnosis.19 The website works on mobile as well as desktop platforms. An HCP can take a CXR photo with a mobile phone or upload the image file. The ML algorithm then provides the probability of COVID-19 pneumonia, non-COVID-19 pneumonia, or normal lung diagnosis (Figure 3).

fdp03709398_f3.png

Emerging diseases such as COVID-19 present numerous challenges to HCPs, governments, and businesses, as well as to individual members of society. As evidenced with COVID-19, the time from first recognition of an emerging pathogen to the development of methods for reliable diagnosis and treatment can be months, even with a concerted international effort. The gold standard for diagnosis of COVID-19 is by reverse transcriptase PCR (RT-PCR) technologies; however, early RT-PCR testing produced less than optimal results.20-22 Even after the development of reliable tests for detection, making test kits readily available to health care providers on an adequate scale presents an additional challenge as evident with COVID-19.

Use of X-ray vs Computed Tomography

The lack of availability of diagnostic RT-PCR with COVID-19 initially placed increased reliance on presumptive diagnoses via imaging in some situations.23 Most of the literature evaluating radiographs of patients with COVID-19 focuses on chest computed tomography (CT) findings, with initial results suggesting CT was more accurate than early RT-PCR methodologies.21,22,24 The Radiological Society of North America Expert consensus statement on chest CT for COVID-19 states that CT findings can even precede positivity on RT-PCR in some cases.22 However, currently it does not recommend the use of CT scanning as a screening tool. Furthermore, the actual sensitivity and specificity of CT interpretation by radiologists for COVID-19 are unknown.22

Characteristic CT findings include ground-glass opacities (GGOs) and consolidation most commonly in the lung periphery, though a diffuse distribution was found in a minority of patients.21,23,25-27 Lomoro and colleagues recently summarized the CT findings from several reports that described abnormalities as most often bilateral and peripheral, subpleural, and affecting the lower lobes.26 Not surprisingly, CT appears more sensitive at detecting changes with COVID-19 than does CXR, with reports that a minority of patients exhibited CT changes before changes were visible on CXR.23,26

We focused our study on the potential of AI in the examination of CXRs in patients with COVID-19, as there are several limitations to the routine use of CT scans with conditions such as COVID-19. Aside from the more considerable time required to obtain CTs, there are issues with contamination of CT suites, sometimes requiring a dedicated COVID-19 CT scanner.23,28 The time constraints of decontamination or limited utilization of CT suites can delay or disrupt services for patients with and without COVID-19. Because of these factors, CXR may be a better resource to minimize the risk of infection to other patients. Also, accurate assessment of abnormalities on CXR for COVID-19 may identify patients in whom the CXR was performed for other purposes.23 CXR is more readily available than CT, especially in more remote or underdeveloped areas.28 Finally, as with CT, CXR abnormalities are reported to have appeared before RT-PCR tests became positive for a minority of patients.23

CXR findings described in patients with COVID-19 are similar to those of CT and include GGOs, consolidation, and hazy increased opacities.23,25,26,28,29 As with CT, the majority of patients who received CXR demonstrated greater involvement in the lower zones and peripherally.23,25,26,28,29 Most patients showed bilateral involvement. However, while these findings are common in patients with COVID-19, they are not specific and can be seen in other conditions, such as other viral pneumonias, bacterial pneumonia, injury from drug toxicity, inhalation injury, connective tissue disease, and idiopathic conditions.

Application of AI for COVID-19

Applications of AI in interpreting radiographs of various types are numerous, and extensive literature has been written on the topic.30 Using deep learning algorithms, AI has multiple possible roles to augment traditional radiograph interpretation. These include the potential for screening, triaging, and increasing the speed to render diagnoses. It also can provide a rapid “second opinion” to the radiologist to support the final interpretation. In areas with critical shortages of radiologists, AI potentially can be used to render the definitive diagnosis. In COVID-19, imaging studies have been shown to correlate with disease severity and mortality, and AI could assist in monitoring the course of the disease as it progresses and potentially identify patients at greatest risk.27 Furthermore, early results from PCR have been considered suboptimal, and it is known that patients with COVID-19 can test negative initially even by reliable testing methodologies. As AI technology progresses, AI-assisted interpretation could help detect COVID-19 and guide triage and treatment of patients with a high suspicion of COVID-19 but negative initial PCR results, or in situations where test availability is limited or results are delayed. A rapid diagnostic test as simple as a CXR, if reliable, could offer numerous benefits for containing and preventing the spread of contagions such as COVID-19 early in their course.

Few studies have assessed using AI in the radiologic diagnosis of COVID-19, most of which use CT scanning. Bai and colleagues demonstrated increased accuracy, sensitivity, and specificity in distinguishing chest CTs of COVID-19 patients from other types of pneumonia.21,31 A separate study demonstrated the utility of using AI to differentiate COVID-19 from community-acquired pneumonia with CT.32 However, the effective utility of AI for CXR interpretation also has been demonstrated.14,33 Implementation of convolutional neural network layers has allowed for reliable differentiation of viral and bacterial pneumonia with CXR imaging.34 Evidence suggests that there is great potential in the application of AI in the interpretation of radiographs of all types.

Finally, we have developed a publicly available website based on our studies.19 This website is for research use only as it is based on data from our preliminary investigation. To appear within the website, images must have protected health information removed before uploading. The information on the website, including text, graphics, images, or other material, is for research and may not be appropriate for all circumstances. The website does not provide medical, professional, or licensed advice and is not a substitute for consultation with an HCP. Medical advice should be sought from a qualified HCP for any questions, and the website should not be used for medical diagnosis or treatment.

Limitations

In our preliminary study, we have demonstrated the potential impact AI can have in multiple aspects of patient care for emerging pathogens such as COVID-19 using a test as readily available as a CXR. However, several limitations to this investigation should be mentioned. The study is retrospective in nature, with a limited sample size and with X-rays from patients at various stages of COVID-19 pneumonia. Also, cases of non-COVID-19 pneumonia are not stratified into different types or etiologies. We intended to demonstrate the potential of AI in differentiating COVID-19 pneumonia from non-COVID-19 pneumonia of any etiology, though future studies should address comparison of COVID-19 cases to more specific types of pneumonias, such as those of bacterial or viral origin. Furthermore, the present study does not address any potential effects of additional radiographic findings from coexistent conditions, such as pulmonary edema as seen in congestive heart failure, pleural effusions (which can be seen with COVID-19 pneumonia, though rarely), and interstitial lung disease. Future studies are required to address these issues. Ultimately, prospective studies to assess AI-assisted radiographic interpretation in conditions such as COVID-19 are required to demonstrate the impact on diagnosis, treatment, outcome, and patient safety as these technologies are implemented.

Conclusions

We have used a readily available, commercial platform to demonstrate the potential of AI to assist in the successful diagnosis of COVID-19 pneumonia on CXR images. While this technology has numerous applications in radiology, we have focused on the potential impact on future world health crises such as COVID-19. The findings have implications for screening and triage, initial diagnosis, monitoring disease progression, and identifying patients at increased risk of morbidity and mortality. Based on the data, a website was created to demonstrate how such technologies could be shared and distributed to others to combat entities such as COVID-19 moving forward. Our study offers a small window into the potential for how AI will likely dramatically change the practice of medicine in the future.

References

1. World Health Organization. Coronavirus disease (COVID-19) pandemic. https://www.who.int/emergencies/diseases/novel-coronavirus2019. Updated August 23, 2020. Accessed August 24, 2020.

2. World Health Organization. WHO Director-General’s opening remarks at the media briefing on COVID-19 - 11 March 2020. https://www.who.int/dg/speeches/detail/who-director-general-sopening-remarks-at-the-media-briefing-on-covid-19---11-march2020. Published March 11, 2020. Accessed August 24, 2020.

3. World Health Organization. Coronavirus disease (COVID-19): situation report--209. https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200816-covid-19-sitrep-209.pdf. Updated August 16, 2020. Accessed August 24, 2020.

4. Nicola M, Alsafi Z, Sohrabi C, et al. The socio-economic implications of the coronavirus pandemic (COVID-19): a review. Int J Surg. 2020;78:185-193. doi:10.1016/j.ijsu.2020.04.018

5. da Costa VG, Moreli ML, Saivish MV. The emergence of SARS, MERS and novel SARS-2 coronaviruses in the 21st century. Arch Virol. 2020;165(7):1517-1526. doi:10.1007/s00705-020-04628-0

6. Borkowski AA, Wilson CP, Borkowski SA, et al. Comparing artificial intelligence platforms for histopathologic cancer diagnosis. Fed Pract. 2019;36(10):456-463.

7. Borkowski AA, Wilson CP, Borkowski SA, Thomas LB, Deland LA, Mastorides SM. Apple machine learning algorithms successfully detect colon cancer but fail to predict KRAS mutation status. http://arxiv.org/abs/1812.04660. Updated January 15, 2019. Accessed August 24, 2020.

8. Borkowski AA, Wilson CP, Borkowski SA, Deland LA, Mastorides SM. Using Apple machine learning algorithms to detect and subclassify non-small cell lung cancer. http://arxiv.org/abs/1808.08230. Updated January 15, 2019. Accessed August 24, 2020.

9. Moor J. The Dartmouth College artificial intelligence conference: the next fifty years. AI Mag. 2006;27(4):87. doi:10.1609/AIMAG.V27I4.1911

10. Samuel AL. Some studies in machine learning using the game of checkers. IBM J Res Dev. 1959;3(3):210-229. doi:10.1147/rd.33.0210

11. Sarle WS. Neural networks and statistical models. https://people.orie.cornell.edu/davidr/or474/nn_sas.pdf. Published April 1994. Accessed August 24, 2020.

12. Schmidhuber J. Deep learning in neural networks: an overview. Neural Netw. 2015;61:85-117. doi:10.1016/j.neunet.2014.09.003

13. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444. doi:10.1038/nature14539

14. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25(1):44-56. doi:10.1038/s41591-018-0300-7

15. Cohen JP, Morrison P, Dao L. COVID-19 Image Data Collection. Published online March 25, 2020. Accessed May 13, 2020. http://arxiv.org/abs/2003.11597

16. Radiological Society of North America. RSNA pneumonia detection challenge. https://www.kaggle.com/c/rsna-pneumonia-detection-challenge. Accessed August 24, 2020.

17. Bloice MD, Roth PM, Holzinger A. Biomedical image augmentation using Augmentor. Bioinformatics. 2019;35(21):4522-4524. doi:10.1093/bioinformatics/btz259

18. Cepheid. Xpert Xpress SARS-CoV-2. https://www.cepheid.com/coronavirus. Accessed August 24, 2020.

19. Interknowlogy. COVID-19 detection in chest X-rays. https://interknowlogy-covid-19.azurewebsites.net. Accessed August 27, 2020.

20. Bernheim A, Mei X, Huang M, et al. Chest CT Findings in Coronavirus Disease-19 (COVID-19): Relationship to Duration of Infection. Radiology. 2020;295(3):200463. doi:10.1148/radiol.2020200463

21. Ai T, Yang Z, Hou H, et al. Correlation of Chest CT and RT-PCR Testing for Coronavirus Disease 2019 (COVID-19) in China: a report of 1014 cases. Radiology. 2020;296(2):E32-E40. doi:10.1148/radiol.2020200642

22. Simpson S, Kay FU, Abbara S, et al. Radiological Society of North America Expert Consensus Statement on Reporting Chest CT Findings Related to COVID-19. Endorsed by the Society of Thoracic Radiology, the American College of Radiology, and RSNA - Secondary Publication. J Thorac Imaging. 2020;35(4):219-227. doi:10.1097/RTI.0000000000000524

23. Wong HYF, Lam HYS, Fong AH, et al. Frequency and distribution of chest radiographic findings in patients positive for COVID-19. Radiology. 2020;296(2):E72-E78. doi:10.1148/radiol.2020201160

24. Fang Y, Zhang H, Xie J, et al. Sensitivity of chest CT for COVID-19: comparison to RT-PCR. Radiology. 2020;296(2):E115-E117. doi:10.1148/radiol.2020200432

25. Chen N, Zhou M, Dong X, et al. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study. Lancet. 2020;395(10223):507-513. doi:10.1016/S0140-6736(20)30211-7

26. Lomoro P, Verde F, Zerboni F, et al. COVID-19 pneumonia manifestations at the admission on chest ultrasound, radiographs, and CT: single-center study and comprehensive radiologic literature review. Eur J Radiol Open. 2020;7:100231. doi:10.1016/j.ejro.2020.100231

27. Salehi S, Abedi A, Balakrishnan S, Gholamrezanezhad A. Coronavirus disease 2019 (COVID-19) imaging reporting and data system (COVID-RADS) and common lexicon: a proposal based on the imaging data of 37 studies. Eur Radiol. 2020;30(9):4930-4942. doi:10.1007/s00330-020-06863-0

28. Jacobi A, Chung M, Bernheim A, Eber C. Portable chest X-ray in coronavirus disease-19 (COVID-19): a pictorial review. Clin Imaging. 2020;64:35-42. doi:10.1016/j.clinimag.2020.04.001

29. Bhat R, Hamid A, Kunin JR, et al. Chest imaging in patients hospitalized With COVID-19 infection - a case series. Curr Probl Diagn Radiol. 2020;49(4):294-301. doi:10.1067/j.cpradiol.2020.04.001

30. Liu X, Faes L, Kale AU, et al. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit Health. 2019;1(6):E271-E297. doi:10.1016/S2589-7500(19)30123-2

31. Bai HX, Wang R, Xiong Z, et al. Artificial intelligence augmentation of radiologist performance in distinguishing COVID-19 from pneumonia of other origin at chest CT. Radiology. 2020;296(3):E156-E165. doi:10.1148/radiol.2020201491

32. Li L, Qin L, Xu Z, et al. Using artificial intelligence to detect COVID-19 and community-acquired pneumonia based on pulmonary CT: evaluation of the diagnostic accuracy. Radiology. 2020;296(2):E65-E71. doi:10.1148/radiol.2020200905

33. Rajpurkar P, Joshi A, Pareek A, et al. CheXpedition: investigating generalization challenges for translation of chest x-ray algorithms to the clinical setting. http://arxiv.org/abs/2002.11379. Updated March 11, 2020. Accessed August 24, 2020.

34. Kermany DS, Goldbaum M, Cai W, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell. 2018;172(5):1122-1131.e9. doi:10.1016/j.cell.2018.02.010

Author and Disclosure Information

Andrew Borkowski is Chief of the Molecular Diagnostics Laboratory, L. Brannon Thomas is Chief of the Microbiology Laboratory, Lauren Deland is a Research Coordinator, and Stephen Mastorides is Chief of Pathology; Narayan Viswanadhan is Assistant Chief of Radiology; all at the James A. Haley Veterans’ Hospital in Tampa, Florida. Rodney Guzman is a Cofounder of InterKnowlogy, LLC in Carlsbad, California. Andrew Borkowski and Stephen Mastorides are Professors and L. Brannon Thomas is an Assistant Professor, all in the Department of Pathology and Cell Biology, University of South Florida, Morsani College of Medicine in Tampa, Florida.
Correspondence: Andrew Borkowski (andrew.borkowski@va.gov)

Author disclosures
The authors report no actual or potential conflicts of interest with regard to this article.

Disclaimer
The opinions expressed herein are those of the authors and do not necessarily reflect those of Federal Practitioner, Frontline Medical Communications Inc., the US Government, or any of its agencies.

Issue: Federal Practitioner - 37(9)a
Page Number: 398-404

The novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which causes the respiratory disease coronavirus disease-19 (COVID-19), was first identified as a cluster of cases of pneumonia in Wuhan, Hubei Province of China on December 31, 2019.1 Within a month, the disease had spread significantly, leading the World Health Organization (WHO) to designate COVID-19 a public health emergency of international concern. On March 11, 2020, the WHO declared COVID-19 a global pandemic.2 As of August 18, 2020, the virus has infected > 21 million people, with > 750,000 deaths worldwide.3 The spread of COVID-19 has had a dramatic impact on social, economic, and health care issues throughout the world, which has been discussed elsewhere.4

Prior to this century, members of the coronavirus family had minimal impact on human health.5 However, in the past 20 years, outbreaks have highlighted the emerging importance of coronaviruses in morbidity and mortality on a global scale. Although less prevalent than COVID-19, severe acute respiratory syndrome (SARS) in 2002 to 2003 and Middle East respiratory syndrome (MERS) in 2012 likely had higher mortality rates than the current pandemic.5 Based on this recent history, it is reasonable to assume that we will continue to see novel diseases with similar significant health and societal implications. The challenges presented to health care providers (HCPs) by such novel viral pathogens are numerous, including methods for rapid diagnosis, prevention, and treatment. In the current study, we focus on diagnostic issues, which were evident with COVID-19 in the time required to develop rapid and effective diagnostic modalities.

We have previously reported the utility of using artificial intelligence (AI) in the histopathologic diagnosis of cancer.6-8 AI was first described in 1956 and involves the field of computer science in which machines are trained to learn from experience.9 Machine learning (ML) is a subset of AI and is achieved by using mathematical models to compute sample datasets.10 Current ML employs deep learning with neural network algorithms, which can recognize patterns and achieve complex computational tasks, often far more quickly and with greater precision than humans can.11-13 In addition to applications in pathology, ML algorithms have both prognostic and diagnostic applications in multiple medical specialties, such as radiology, dermatology, ophthalmology, and cardiology.6 It is predicted that AI will impact almost every aspect of health care in the future.14

In this article, we examine the potential for AI to diagnose patients with COVID-19 pneumonia using chest radiographs (CXR) alone. This is done using Microsoft CustomVision (www.customvision.ai), a readily available, automated ML platform. Employing AI to both screen and diagnose emerging health emergencies such as COVID-19 has the potential to dramatically change how we approach medical care in the future. In addition, we describe the creation of a publicly available website (interknowlogy-covid-19.azurewebsites.net) that could augment COVID-19 pneumonia CXR diagnosis.

Methods

For the training dataset, 103 CXR images of COVID-19 were downloaded from the GitHub covid-chest-xray dataset.15 Five hundred images of non-COVID-19 pneumonia and 500 images of the normal lung were downloaded from the Kaggle RSNA Pneumonia Detection Challenge dataset.16 To balance the dataset, we expanded the COVID-19 dataset to 500 images by slight rotation (probability = 1, max rotation = 5) and zooming (probability = 0.5, percentage area = 0.9) of the original images using the Augmentor Python package.17
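
As a rough illustration, the augmentation described above could be expressed with the Augmentor package along the following lines. This is a sketch rather than the authors' script: the helper names, the source directory argument, and the choice to keep the 103 originals and generate only the shortfall are assumptions. Only the rotation and zoom parameters come from the text.

```python
def images_needed(original_count: int, target_count: int) -> int:
    """Number of augmented images required to reach the target class size."""
    return max(target_count - original_count, 0)


def augment_covid_images(source_dir: str, original_count: int = 103,
                         target_count: int = 500) -> int:
    """Generate augmented CXR images with Augmentor (imported lazily).

    Assumes the originals are kept and only the shortfall is generated;
    the source does not state how the 500 images were composed.
    """
    import Augmentor  # third-party package: pip install Augmentor

    # By default Augmentor writes results to an "output" folder inside source_dir.
    pipeline = Augmentor.Pipeline(source_dir)
    # Slight rotation applied to every image (probability = 1, max rotation = 5).
    pipeline.rotate(probability=1, max_left_rotation=5, max_right_rotation=5)
    # Random zoom applied to about half of the images (probability = 0.5, percentage area = 0.9).
    pipeline.zoom_random(probability=0.5, percentage_area=0.9)

    n = images_needed(original_count, target_count)
    pipeline.sample(n)  # draw n augmented images from the pipeline
    return n
```

Under the stated assumption, `images_needed(103, 500)` gives 397 augmented images to add to the originals.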

Validation Dataset

For the validation dataset, 30 random CXR images were obtained from the US Department of Veterans Affairs (VA) PACS (picture archiving and communication system). This dataset included 10 CXR images from hospitalized patients with COVID-19, 10 CXR pneumonia images from patients without COVID-19, and 10 normal CXRs. COVID-19 diagnoses were confirmed with a positive test result from the Xpert Xpress SARS-CoV-2 polymerase chain reaction (PCR) platform.18

 

 

Microsoft CustomVision

Microsoft CustomVision is an automated image classification and object detection system that is a part of Microsoft Azure Cognitive Services (azure.microsoft.com). It has a pay-as-you-go model with fees depending on the computing needs and usage. It offers a free trial to users for 2 initial projects. The service is online with an easy-to-follow graphical user interface. No coding skills are necessary.

fdp03709398_f1.png

We created a new classification project in CustomVision and chose a compact general domain for small size and easy export to TensorFlow.js model format. TensorFlow.js is a JavaScript library that enables dynamic download and execution of ML models. After the project was created, we proceeded to upload our image dataset. Each class was uploaded separately and tagged with the appropriate label (covid pneumonia, non-covid pneumonia, or normal lung). The system rejected 16 COVID-19 images as duplicates. The final CustomVision training dataset consisted of 484 images of COVID-19 pneumonia, 500 images of non-COVID-19 pneumonia, and 500 images of normal lungs. Once uploaded, CustomVision self-trains using the dataset upon initiating the program (Figure 1).
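
How CustomVision detects duplicates is not described here. One common approach is to flag byte-identical files by content hash before upload, sketched below. The function name and the use of SHA-256 are assumptions for illustration, and this check would not catch near-duplicates such as slightly rotated copies.

```python
import hashlib
from pathlib import Path


def find_exact_duplicates(image_dir: str) -> list[Path]:
    """Return files whose bytes match an earlier file in the directory."""
    seen: dict[str, Path] = {}
    duplicates: list[Path] = []
    # Sort so that "earlier" is deterministic across runs.
    for path in sorted(Path(image_dir).iterdir()):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen:
            duplicates.append(path)  # same content as an earlier file
        else:
            seen[digest] = path
    return duplicates
```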

 

Website Creation

CustomVision was used to train the model. It can be used to execute the model continuously, or the model can be compacted and decoupled from CustomVision. In this case, the model was compacted and decoupled for use in an online application. An Angular online application was created with TensorFlow.js. Within a user’s web browser, the model is executed when an image of a CXR is submitted. Confidence values for each classification are returned. In this design, after the initial webpage and model are downloaded, the webpage no longer needs to access any server components and performs all operations in the browser. Although the solution works well on mobile phone browsers and in low bandwidth situations, the quality of predictions may depend on the browser and device used. At no time does an image get submitted to the cloud.

fdp03709398_f2.png

Results

Overall, our trained model showed 92.9% precision and recall. Precision and recall results for each label were 98.9% and 94.8%, respectively, for COVID-19 pneumonia; 91.8% and 89%, respectively, for non-COVID-19 pneumonia; and 88.8% and 95%, respectively, for normal lung (Figure 2). Next, we proceeded to validate the trained model on the VA data by making individual predictions on 30 images from the VA dataset. Our model performed well with 100% sensitivity (recall), 95% specificity, 97% accuracy, 91% positive predictive value (precision), and 100% negative predictive value (Table).
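
The validation figures can be checked with a short calculation. The counts used below are not taken from the Table; they are back-calculated from the reported percentages under the assumption that the 10 COVID-19 images form the positive class and the 20 remaining images (non-COVID-19 pneumonia and normal lung) form the negative class.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Standard diagnostic metrics from a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),  # recall
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),  # positive predictive value (precision)
        "npv": tn / (tn + fn),  # negative predictive value
    }


# Assumed counts: all 10 COVID-19 images flagged, and 1 of the 20 other
# images misclassified as COVID-19.
metrics = binary_metrics(tp=10, fp=1, tn=19, fn=0)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts, accuracy is 29/30 and positive predictive value is 10/11, which round to the reported 97% and 91%.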

fdp03709398_t.png

 

Discussion

We successfully demonstrated the potential of using AI algorithms in assessing CXRs for COVID-19. We first trained the CustomVision automated image classification and object detection system to differentiate cases of COVID-19 from pneumonia of other etiologies as well as normal lung CXRs. We then tested our model against known patients from the James A. Haley Veterans’ Hospital in Tampa, Florida. The program achieved 100% sensitivity (recall), 95% specificity, 97% accuracy, 91% positive predictive value (precision), and 100% negative predictive value in differentiating the 3 scenarios. Using the trained ML model, we proceeded to create a website that could augment COVID-19 CXR diagnosis.19 The website works on mobile as well as desktop platforms. A health care provider can take a CXR photo with a mobile phone or upload the image file. The ML algorithm would provide the probability of COVID-19 pneumonia, non-COVID-19 pneumonia, or normal lung diagnosis (Figure 3).

fdp03709398_f3.png

Emerging diseases such as COVID-19 present numerous challenges to HCPs, governments, and businesses, as well as to individual members of society. As evidenced with COVID-19, the time from first recognition of an emerging pathogen to the development of methods for reliable diagnosis and treatment can be months, even with a concerted international effort. The gold standard for diagnosis of COVID-19 is by reverse transcriptase PCR (RT-PCR) technologies; however, early RT-PCR testing produced less than optimal results.20-22 Even after the development of reliable tests for detection, making test kits readily available to health care providers on an adequate scale presents an additional challenge as evident with COVID-19.

Use of X-ray vs Computed Tomography

The lack of availability of diagnostic RT-PCR with COVID-19 initially placed increased reliance on presumptive diagnoses via imaging in some situations.23 Most of the literature evaluating radiographs of patients with COVID-19 focuses on chest computed tomography (CT) findings, with initial results suggesting CT was more accurate than early RT-PCR methodologies.21,22,24 The Radiological Society of North America expert consensus statement on chest CT for COVID-19 states that CT findings can even precede positivity on RT-PCR in some cases.22 However, it currently does not recommend the use of CT scanning as a screening tool. Furthermore, the actual sensitivity and specificity of CT interpretation by radiologists for COVID-19 are unknown.22

 

 

Characteristic CT findings include ground-glass opacities (GGOs) and consolidation most commonly in the lung periphery, though a diffuse distribution was found in a minority of patients.21,23,25-27 Lomoro and colleagues recently summarized the CT findings from several reports that described abnormalities as most often bilateral and peripheral, subpleural, and affecting the lower lobes.26 Not surprisingly, CT appears more sensitive at detecting changes with COVID-19 than does CXR, with reports that a minority of patients exhibited CT changes before changes were visible on CXR.23,26

We focused our study on the potential of AI in the examination of CXRs in patients with COVID-19, as there are several limitations to the routine use of CT scans with conditions such as COVID-19. Aside from the more considerable time required to obtain CTs, there are issues with contamination of CT suites, sometimes requiring a dedicated COVID-19 CT scanner.23,28 The time constraints of decontamination or limited utilization of CT suites can delay or disrupt services for patients with and without COVID-19. Because of these factors, CXR may be a better resource to minimize the risk of infection to other patients. Also, accurate assessment of abnormalities on CXR for COVID-19 may identify patients in whom the CXR was performed for other purposes.23 CXR is more readily available than CT, especially in more remote or underdeveloped areas.28 Finally, as with CT, CXR abnormalities are reported to have appeared before RT-PCR tests became positive for a minority of patients.23

CXR findings described in patients with COVID-19 are similar to those of CT and include GGOs, consolidation, and hazy increased opacities.23,25,26,28,29 Like CT, the majority of patients who received CXR demonstrated greater involvement in the lower zones and peripherally.23,25,26,28,29 Most patients showed bilateral involvement. However, while these findings are common in patients with COVID-19, they are not specific and can be seen in other conditions, such as other viral pneumonia, bacterial pneumonia, injury from drug toxicity, inhalation injury, connective tissue disease, and idiopathic conditions.

Application of AI for COVID-19

Applications of AI in interpreting radiographs of various types are numerous, and extensive literature has been written on the topic.30 Using deep learning algorithms, AI has multiple possible roles to augment traditional radiograph interpretation. These include the potential for screening, triaging, and increasing the speed to render diagnoses. It also can provide a rapid “second opinion” to the radiologist to support the final interpretation. In areas with critical shortages of radiologists, AI potentially can be used to render the definitive diagnosis. In COVID-19, imaging studies have been shown to correlate with disease severity and mortality, and AI could assist in monitoring the course of the disease as it progresses and potentially identify patients at greatest risk.27 Furthermore, early results from PCR have been considered suboptimal, and it is known that patients with COVID-19 can test negative initially even by reliable testing methodologies. As AI technology progresses, interpretation can detect and guide triage and treatment of patients with high suspicions of COVID-19 but negative initial PCR results, or in situations where test availability is limited or results are delayed. There are numerous potential benefits should a rapid diagnostic test as simple as a CXR be able to reliably impact containment and prevention of the spread of contagions such as COVID-19 early in its course.

Few studies have assessed using AI in the radiologic diagnosis of COVID-19, most of which use CT scanning. Bai and colleagues demonstrated increased accuracy, sensitivity, and specificity in distinguishing chest CTs of COVID-19 patients from other types of pneumonia.21,31 A separate study demonstrated the utility of using AI to differentiate COVID-19 from community-acquired pneumonia with CT.32 However, the effective utility of AI for CXR interpretation also has been demonstrated.14,33 Implementation of convolutional neural network layers has allowed for reliable differentiation of viral and bacterial pneumonia with CXR imaging.34 Evidence suggests that there is great potential in the application of AI in the interpretation of radiographs of all types.

Finally, we have developed a publicly available website based on our studies.19 This website is for research use only as it is based on data from our preliminary investigation. Images must have protected health information removed before they are uploaded to the website. The information on the website, including text, graphics, images, or other material, is for research and may not be appropriate for all circumstances. The website does not provide medical, professional, or licensed advice and is not a substitute for consultation with an HCP. Medical advice should be sought from a qualified HCP for any questions, and the website should not be used for medical diagnosis or treatment.

 

 

Limitations

In our preliminary study, we have demonstrated the potential impact AI can have in multiple aspects of patient care for emerging pathogens such as COVID-19 using a test as readily available as a CXR. However, several limitations to this investigation should be mentioned. The study is retrospective in nature with limited sample size and with X-rays from patients with various stages of COVID-19 pneumonia. Also, cases of non-COVID-19 pneumonia are not stratified into different types or etiologies. We intend to demonstrate the potential of AI in differentiating COVID-19 pneumonia from non-COVID-19 pneumonia of any etiology, though future studies should address comparison of COVID-19 cases to more specific types of pneumonias, such as of bacterial or viral origin. Furthermore, the present study does not address any potential effects of additional radiographic findings from coexistent conditions, such as pulmonary edema as seen in congestive heart failure, pleural effusions (which can be seen with COVID-19 pneumonia, though rarely), interstitial lung disease, etc. Future studies are required to address these issues. Ultimately, prospective studies to assess AI-assisted radiographic interpretation in conditions such as COVID-19 are required to demonstrate the impact on diagnosis, treatment, outcome, and patient safety as these technologies are implemented.

Conclusions

We have used a readily available, commercial platform to demonstrate the potential of AI to assist in the successful diagnosis of COVID-19 pneumonia on CXR images. While this technology has numerous applications in radiology, we have focused on the potential impact on future world health crises such as COVID-19. The findings have implications for screening and triage, initial diagnosis, monitoring disease progression, and identifying patients at increased risk of morbidity and mortality. Based on the data, a website was created to demonstrate how such technologies could be shared and distributed to others to combat entities such as COVID-19 moving forward. Our study offers a small window into the potential for how AI will likely dramatically change the practice of medicine in the future.

The novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARSCoV- 2), which causes the respiratory disease coronavirus disease-19 (COVID- 19), was first identified as a cluster of cases of pneumonia in Wuhan, Hubei Province of China on December 31, 2019.1 Within a month, the disease had spread significantly, leading the World Health Organization (WHO) to designate COVID-19 a public health emergency of international concern. On March 11, 2020, the WHO declared COVID-19 a global pandemic.2 As of August 18, 2020, the virus has infected > 21 million people, with > 750,000 deaths worldwide.3 The spread of COVID-19 has had a dramatic impact on social, economic, and health care issues throughout the world, which has been discussed elsewhere.4

Prior to the this century, members of the coronavirus family had minimal impact on human health.5 However, in the past 20 years, outbreaks have highlighted an emerging importance of coronaviruses in morbidity and mortality on a global scale. Although less prevalent than COVID-19, severe acute respiratory syndrome (SARS) in 2002 to 2003 and Middle East respiratory syndrome (MERS) in 2012 likely had higher mortality rates than the current pandemic.5 Based on this recent history, it is reasonable to assume that we will continue to see novel diseases with similar significant health and societal implications. The challenges presented to health care providers (HCPs) by such novel viral pathogens are numerous, including methods for rapid diagnosis, prevention, and treatment. In the current study, we focus on diagnosis issues, which were evident with COVID-19 with the time required to develop rapid and effective diagnostic modalities.

We have previously reported the utility of using artificial intelligence (AI) in the histopathologic diagnosis of cancer.6-8 AI was first described in 1956 and involves the field of computer science in which machines are trained to learn from experience.9 Machine learning (ML) is a subset of AI and is achieved by using mathematic models to compute sample datasets.10 Current ML employs deep learning with neural network algorithms, which can recognize patterns and achieve complex computational tasks often far quicker and with increased precision than can humans.11-13 In addition to applications in pathology, ML algorithms have both prognostic and diagnostic applications in multiple medical specialties, such as radiology, dermatology, ophthalmology, and cardiology.6 It is predicted that AI will impact almost every aspect of health care in the future.14

In this article, we examine the potential for AI to diagnose patients with COVID-19 pneumonia using chest radiographs (CXR) alone. This is done using Microsoft CustomVision (www.customvision.ai), a readily available, automated ML platform. Employing AI to both screen and diagnose emerging health emergencies such as COVID-19 has the potential to dramatically change how we approach medical care in the future. In addition, we describe the creation of a publicly available website (interknowlogy-covid-19 .azurewebsites.net) that could augment COVID-19 pneumonia CXR diagnosis.

Methods

For the training dataset, 103 CXR images of COVID-19 were downloaded from GitHub covid-chest-xray dataset.15 Five hundred images of non-COVID-19 pneumonia and 500 images of the normal lung were downloaded from the Kaggle RSNA Pneumonia Detection Challenge dataset.16 To balance the dataset, we expanded the COVID-19 dataset to 500 images by slight rotation (probability = 1, max rotation = 5) and zooming (probability = 0.5, percentage area = 0.9) of the original images using the Augmentor Python package.17

Validation Dataset

For the validation dataset 30 random CXR images were obtained from the US Department of Veterans Affairs (VA) PACS (picture archiving and communication system). This dataset included 10 CXR images from hospitalized patients with COVID-19, 10 CXR pneumonia images from patients without COVID-19, and 10 normal CXRs. COVID-19 diagnoses were confirmed with a positive test result from the Xpert Xpress SARS-CoV-2 polymerase chain reaction (PCR) platform.18

 

 

Microsoft Custom

Vision Microsoft CustomVision is an automated image classification and object detection system that is a part of Microsoft Azure Cognitive Services (azure.microsoft.com). It has a pay-as-you-go model with fees depending on the computing needs and usage. It offers a free trial to users for 2 initial projects. The service is online with an easy-to-follow graphical user interface. No coding skills are necessary.

fdp03709398_f1.png

We created a new classification project in CustomVision and chose a compact general domain for small size and easy export to TensorFlow. js model format. TensorFlow.js is a JavaScript library that enables dynamic download and execution of ML models. After the project was created, we proceeded to upload our image dataset. Each class was uploaded separately and tagged with the appropriate label (covid pneumonia, non-covid pneumonia, or normal lung). The system rejected 16 COVID-19 images as duplicates. The final CustomVision training dataset consisted of 484 images of COVID-19 pneumonia, 500 images of non-COVID-19 pneumonia, and 500 images of normal lungs. Once uploaded, CustomVision self-trains using the dataset upon initiating the program (Figure 1).

 

Website Creation

CustomVision was used to train the model. It can be used to execute the model continuously, or the model can be compacted and decoupled from CustomVision. In this case, the model was compacted and decoupled for use in an online application. An Angular online application was created with TensorFlow.js. Within a user’s web browser, the model is executed when an image of a CXR is submitted. Confidence values for each classification are returned. In this design, after the initial webpage and model is downloaded, the webpage no longer needs to access any server components and performs all operations in the browser. Although the solution works well on mobile phone browsers and in low bandwidth situations, the quality of predictions may depend on the browser and device used. At no time does an image get submitted to the cloud.

fdp03709398_f2.png

Result

Overall, our trained model showed 92.9% precision and recall. Precision and recall results for each label were 98.9% and 94.8%, respectively for COVID-19 pneumonia; 91.8% and 89%, respectively, for non- COVID-19 pneumonia; and 88.8% and 95%, respectively, for normal lung (Figure 2). Next, we proceeded to validate the training model on the VA data by making individual predictions on 30 images from the VA dataset. Our model performed well with 100% sensitivity (recall), 95% specificity, 97% accuracy, 91% positive predictive value (precision), and 100% negative predictive value (Table).

fdp03709398_t.png

 

Discussion

We successfully demonstrated the potential of using AI algorithms in assessing CXRs for COVID-19. We first trained the CustomVision automated image classification and object detection system to differentiate cases of COVID-19 from pneumonia from other etiologies as well as normal lung CXRs. We then tested our model against known patients from the James A. Haley Veterans’ Hospital in Tampa, Florida. The program achieved 100% sensitivity (recall), 95% specificity, 97% accuracy, 91% positive predictive value (precision), and 100% negative predictive value in differentiating the 3 scenarios. Using the trained ML model, we proceeded to create a website that could augment COVID-19 CXR diagnosis.19 The website works on mobile as well as desktop platforms. A health care provider can take a CXR photo with a mobile phone or upload the image file. The ML algorithm would provide the probability of COVID-19 pneumonia, non-COVID-19 pneumonia, or normal lung diagnosis (Figure 3).

fdp03709398_f3.png

Emerging diseases such as COVID-19 present numerous challenges to HCPs, governments, and businesses, as well as to individual members of society. As evidenced with COVID-19, the time from first recognition of an emerging pathogen to the development of methods for reliable diagnosis and treatment can be months, even with a concerted international effort. The gold standard for diagnosis of COVID-19 is by reverse transcriptase PCR (RT-PCR) technologies; however, early RT-PCR testing produced less than optimal results.20-22 Even after the development of reliable tests for detection, making test kits readily available to health care providers on an adequate scale presents an additional challenge as evident with COVID-19.

Use of X-ray vs Computed Tomography

The lack of availability of diagnostic RT-PCR for COVID-19 initially placed increased reliance on presumptive diagnoses via imaging in some situations.23 Most of the literature evaluating radiographs of patients with COVID-19 focuses on chest computed tomography (CT) findings, with initial results suggesting CT was more accurate than early RT-PCR methodologies.21,22,24 The Radiological Society of North America expert consensus statement on chest CT for COVID-19 states that CT findings can even precede positivity on RT-PCR in some cases.22 However, the statement does not currently recommend the use of CT scanning as a screening tool. Furthermore, the actual sensitivity and specificity of CT interpretation by radiologists for COVID-19 are unknown.22

Characteristic CT findings include ground-glass opacities (GGOs) and consolidation most commonly in the lung periphery, though a diffuse distribution was found in a minority of patients.21,23,25-27 Lomoro and colleagues recently summarized the CT findings from several reports that described abnormalities as most often bilateral and peripheral, subpleural, and affecting the lower lobes.26 Not surprisingly, CT appears more sensitive at detecting changes with COVID-19 than does CXR, with reports that a minority of patients exhibited CT changes before changes were visible on CXR.23,26

We focused our study on the potential of AI in the examination of CXRs in patients with COVID-19, as there are several limitations to the routine use of CT scans with conditions such as COVID-19. Aside from the more considerable time required to obtain CTs, there are issues with contamination of CT suites, sometimes requiring a dedicated COVID-19 CT scanner.23,28 The time constraints of decontamination or limited utilization of CT suites can delay or disrupt services for patients with and without COVID-19. Because of these factors, CXR may be a better resource to minimize the risk of infection to other patients. Also, accurate assessment of abnormalities on CXR for COVID-19 may identify patients in whom the CXR was performed for other purposes.23 CXR is more readily available than CT, especially in more remote or underdeveloped areas.28 Finally, as with CT, CXR abnormalities are reported to have appeared before RT-PCR tests became positive for a minority of patients.23

CXR findings described in patients with COVID-19 are similar to those of CT and include GGOs, consolidation, and hazy increased opacities.23,25,26,28,29 As with CT, most patients who underwent CXR demonstrated greater involvement in the lower zones and peripherally.23,25,26,28,29 Most patients showed bilateral involvement. However, while these findings are common in patients with COVID-19, they are not specific and can be seen in other conditions, such as other viral pneumonias, bacterial pneumonia, injury from drug toxicity, inhalation injury, connective tissue disease, and idiopathic conditions.

Application of AI for COVID-19

Applications of AI in interpreting radiographs of various types are numerous, and extensive literature has been written on the topic.30 Using deep learning algorithms, AI has multiple possible roles to augment traditional radiograph interpretation. These include the potential for screening, triaging, and increasing the speed to render diagnoses. It also can provide a rapid “second opinion” to the radiologist to support the final interpretation. In areas with critical shortages of radiologists, AI potentially can be used to render the definitive diagnosis. In COVID-19, imaging studies have been shown to correlate with disease severity and mortality, and AI could assist in monitoring the course of the disease as it progresses and potentially identify patients at greatest risk.27 Furthermore, early results from PCR have been considered suboptimal, and it is known that patients with COVID-19 can test negative initially even by reliable testing methodologies. As AI technology progresses, AI-assisted interpretation could help detect disease and guide triage and treatment of patients with high suspicion of COVID-19 but negative initial PCR results, or in situations where test availability is limited or results are delayed. There are numerous potential benefits should a rapid diagnostic test as simple as a CXR be able to reliably impact containment and prevention of the spread of contagions such as COVID-19 early in its course.

Few studies have assessed using AI in the radiologic diagnosis of COVID-19, most of which use CT scanning. Bai and colleagues demonstrated increased accuracy, sensitivity, and specificity in distinguishing chest CTs of COVID-19 patients from other types of pneumonia.21,31 A separate study demonstrated the utility of using AI to differentiate COVID-19 from community-acquired pneumonia with CT.32 However, the effective utility of AI for CXR interpretation also has been demonstrated.14,33 Implementation of convolutional neural network layers has allowed for reliable differentiation of viral and bacterial pneumonia with CXR imaging.34 Evidence suggests that there is great potential in the application of AI in the interpretation of radiographs of all types.

Finally, we have developed a publicly available website based on our studies.19 This website is for research use only as it is based on data from our preliminary investigation. Images must have protected health information removed before they are uploaded to the website. The information on the website, including text, graphics, images, or other material, is for research and may not be appropriate for all circumstances. The website does not provide medical, professional, or licensed advice and is not a substitute for consultation with an HCP. Medical advice should be sought from a qualified HCP for any questions, and the website should not be used for medical diagnosis or treatment.

Limitations

In our preliminary study, we have demonstrated the potential impact AI can have in multiple aspects of patient care for emerging pathogens such as COVID-19 using a test as readily available as a CXR. However, several limitations to this investigation should be mentioned. The study is retrospective in nature with limited sample size and with X-rays from patients with various stages of COVID-19 pneumonia. Also, cases of non-COVID-19 pneumonia are not stratified into different types or etiologies. We intend to demonstrate the potential of AI in differentiating COVID-19 pneumonia from non-COVID-19 pneumonia of any etiology, though future studies should address comparison of COVID-19 cases to more specific types of pneumonias, such as of bacterial or viral origin. Furthermore, the present study does not address any potential effects of additional radiographic findings from coexistent conditions, such as pulmonary edema as seen in congestive heart failure, pleural effusions (which can be seen with COVID-19 pneumonia, though rarely), interstitial lung disease, etc. Future studies are required to address these issues. Ultimately, prospective studies to assess AI-assisted radiographic interpretation in conditions such as COVID-19 are required to demonstrate the impact on diagnosis, treatment, outcome, and patient safety as these technologies are implemented.

Conclusions

We have used a readily available, commercial platform to demonstrate the potential of AI to assist in the successful diagnosis of COVID-19 pneumonia on CXR images. While this technology has numerous applications in radiology, we have focused on the potential impact on future world health crises such as COVID-19. The findings have implications for screening and triage, initial diagnosis, monitoring disease progression, and identifying patients at increased risk of morbidity and mortality. Based on the data, a website was created to demonstrate how such technologies could be shared and distributed to others to combat entities such as COVID-19 moving forward. Our study offers a small window into the potential for how AI will likely dramatically change the practice of medicine in the future.

References

1. World Health Organization. Coronavirus disease (COVID-19) pandemic. https://www.who.int/emergencies/diseases/novel-coronavirus2019. Updated August 23, 2020. Accessed August 24, 2020.

2. World Health Organization. WHO Director-General’s opening remarks at the media briefing on COVID-19 - 11 March 2020. https://www.who.int/dg/speeches/detail/who-director-general-sopening-remarks-at-the-media-briefing-on-covid-19---11-march2020. Published March 11, 2020. Accessed August 24, 2020.

3. World Health Organization. Coronavirus disease (COVID-19): situation report--209. https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200816-covid-19-sitrep-209.pdf. Updated August 16, 2020. Accessed August 24, 2020.

4. Nicola M, Alsafi Z, Sohrabi C, et al. The socio-economic implications of the coronavirus pandemic (COVID-19): a review. Int J Surg. 2020;78:185-193. doi:10.1016/j.ijsu.2020.04.018

5. da Costa VG, Moreli ML, Saivish MV. The emergence of SARS, MERS and novel SARS-2 coronaviruses in the 21st century. Arch Virol. 2020;165(7):1517-1526. doi:10.1007/s00705-020-04628-0

6. Borkowski AA, Wilson CP, Borkowski SA, et al. Comparing artificial intelligence platforms for histopathologic cancer diagnosis. Fed Pract. 2019;36(10):456-463.

7. Borkowski AA, Wilson CP, Borkowski SA, Thomas LB, Deland LA, Mastorides SM. Apple machine learning algorithms successfully detect colon cancer but fail to predict KRAS mutation status. http://arxiv.org/abs/1812.04660. Updated January 15, 2019. Accessed August 24, 2020.

8. Borkowski AA, Wilson CP, Borkowski SA, Deland LA, Mastorides SM. Using Apple machine learning algorithms to detect and subclassify non-small cell lung cancer. http://arxiv.org/abs/1808.08230. Updated January 15, 2019. Accessed August 24, 2020.

9. Moor J. The Dartmouth College artificial intelligence conference: the next fifty years. AI Mag. 2006;27(4):87. doi:10.1609/AIMAG.V27I4.1911

10. Samuel AL. Some studies in machine learning using the game of checkers. IBM J Res Dev. 1959;3(3):210-229. doi:10.1147/rd.33.0210

11. Sarle WS. Neural networks and statistical models. https://people.orie.cornell.edu/davidr/or474/nn_sas.pdf. Published April 1994. Accessed August 24, 2020.

12. Schmidhuber J. Deep learning in neural networks: an overview. Neural Netw. 2015;61:85-117. doi:10.1016/j.neunet.2014.09.003

13. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444. doi:10.1038/nature14539

14. Topol EJ. High-performance medicine: the convergence of human and artificial intelligence. Nat Med. 2019;25(1):44-56. doi:10.1038/s41591-018-0300-7

15. Cohen JP, Morrison P, Dao L. COVID-19 Image Data Collection. Published online March 25, 2020. Accessed May 13, 2020. http://arxiv.org/abs/2003.11597

16. Radiological Society of North America. RSNA pneumonia detection challenge. https://www.kaggle.com/c/rsnapneumonia-detectionchallenge. Accessed August 24, 2020.

17. Bloice MD, Roth PM, Holzinger A. Biomedical image augmentation using Augmentor. Bioinformatics. 2019;35(21):4522-4524. doi:10.1093/bioinformatics/btz259

18. Cepheid. Xpert Xpress SARS-CoV-2. https://www.cepheid.com/coronavirus. Accessed August 24, 2020.

19. Interknowlogy. COVID-19 detection in chest X-rays. https://interknowlogy-covid-19.azurewebsites.net. Accessed August 27, 2020.

20. Bernheim A, Mei X, Huang M, et al. Chest CT Findings in Coronavirus Disease-19 (COVID-19): Relationship to Duration of Infection. Radiology. 2020;295(3):200463. doi:10.1148/radiol.2020200463

21. Ai T, Yang Z, Hou H, et al. Correlation of Chest CT and RT-PCR Testing for Coronavirus Disease 2019 (COVID-19) in China: a report of 1014 cases. Radiology. 2020;296(2):E32-E40. doi:10.1148/radiol.2020200642

22. Simpson S, Kay FU, Abbara S, et al. Radiological Society of North America Expert Consensus Statement on Reporting Chest CT Findings Related to COVID-19. Endorsed by the Society of Thoracic Radiology, the American College of Radiology, and RSNA - Secondary Publication. J Thorac Imaging. 2020;35(4):219-227. doi:10.1097/RTI.0000000000000524

23. Wong HYF, Lam HYS, Fong AH, et al. Frequency and distribution of chest radiographic findings in patients positive for COVID-19. Radiology. 2020;296(2):E72-E78. doi:10.1148/radiol.2020201160

24. Fang Y, Zhang H, Xie J, et al. Sensitivity of chest CT for COVID-19: comparison to RT-PCR. Radiology. 2020;296(2):E115-E117. doi:10.1148/radiol.2020200432

25. Chen N, Zhou M, Dong X, et al. Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study. Lancet. 2020;395(10223):507-513. doi:10.1016/S0140-6736(20)30211-7

26. Lomoro P, Verde F, Zerboni F, et al. COVID-19 pneumonia manifestations at the admission on chest ultrasound, radiographs, and CT: single-center study and comprehensive radiologic literature review. Eur J Radiol Open. 2020;7:100231. doi:10.1016/j.ejro.2020.100231

27. Salehi S, Abedi A, Balakrishnan S, Gholamrezanezhad A. Coronavirus disease 2019 (COVID-19) imaging reporting and data system (COVID-RADS) and common lexicon: a proposal based on the imaging data of 37 studies. Eur Radiol. 2020;30(9):4930-4942. doi:10.1007/s00330-020-06863-0

28. Jacobi A, Chung M, Bernheim A, Eber C. Portable chest X-ray in coronavirus disease-19 (COVID-19): a pictorial review. Clin Imaging. 2020;64:35-42. doi:10.1016/j.clinimag.2020.04.001

29. Bhat R, Hamid A, Kunin JR, et al. Chest imaging in patients hospitalized With COVID-19 infection - a case series. Curr Probl Diagn Radiol. 2020;49(4):294-301. doi:10.1067/j.cpradiol.2020.04.001

30. Liu X, Faes L, Kale AU, et al. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. Lancet Digit Health. 2019;1(6):E271-E297. doi:10.1016/S2589-7500(19)30123-2

31. Bai HX, Wang R, Xiong Z, et al. Artificial intelligence augmentation of radiologist performance in distinguishing COVID-19 from pneumonia of other origin at chest CT. Radiology. 2020;296(3):E156-E165. doi:10.1148/radiol.2020201491

32. Li L, Qin L, Xu Z, et al. Using artificial intelligence to detect COVID-19 and community-acquired pneumonia based on pulmonary CT: evaluation of the diagnostic accuracy. Radiology. 2020;296(2):E65-E71. doi:10.1148/radiol.2020200905

33. Rajpurkar P, Joshi A, Pareek A, et al. CheXpedition: investigating generalization challenges for translation of chest x-ray algorithms to the clinical setting. http://arxiv.org/abs/2002.11379. Updated March 11, 2020. Accessed August 24, 2020.

34. Kermany DS, Goldbaum M, Cai W, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning. Cell. 2018;172(5):1122-1131.e9. doi:10.1016/j.cell.2018.02.010

Issue
Federal Practitioner - 37(9)a
Page Number
398-404

Comparing Artificial Intelligence Platforms for Histopathologic Cancer Diagnosis

Article Type
Changed
Mon, 10/07/2019 - 08:57
Two machine learning platforms were successfully used to provide diagnostic guidance in the differentiation between common cancer conditions in veteran populations.

Artificial intelligence (AI), first described in 1956, encompasses the field of computer science in which machines are trained to learn from experience. The term was popularized by the 1956 Dartmouth College Summer Research Project on Artificial Intelligence.1 The field of AI is rapidly growing and has the potential to affect many aspects of our lives. The emerging importance of AI is demonstrated by a February 2019 executive order that launched the American AI Initiative, allocating resources and funding for AI development.2 The executive order stresses the potential impact of AI in the health care field, including its potential utility to diagnose disease. Federal agencies were directed to invest in AI research and development to promote rapid breakthroughs in AI technology that may impact multiple areas of society.

Machine learning (ML), a subset of AI, was defined in 1959 by Arthur Samuel and is achieved by employing mathematical models to compute sample data sets.3 Originating from statistical linear models, neural networks were conceived to accomplish these tasks.4 These pioneering scientific achievements led to recent developments of deep neural networks. These models are developed to recognize patterns and achieve complex computational tasks within a matter of minutes, often far exceeding human ability.5 Compared with human decision making, ML can increase efficiency through decreased computation time while maintaining high precision and recall.6

ML has the potential for numerous applications in the health care field.7-9 One promising application is in the field of anatomic pathology. ML allows representative images to be used to train a computer to recognize patterns from labeled photographs. Based on a set of images selected to represent a specific tissue or disease process, the computer can be trained to evaluate and recognize new and unique images from patients and render a diagnosis.10 Prior to modern ML models, users would have to import many thousands of training images to produce algorithms that could recognize patterns with high accuracy. Modern ML algorithms allow for a technique known as transfer learning, in which far fewer images are required for training.11-13

Two novel ML platforms available for public use are offered through Google (Mountain View, CA) and Apple (Cupertino, CA).14,15 They each offer a user-friendly interface with minimal experience required in computer science. Google AutoML uses ML via cloud services to store and retrieve data with ease. No coding knowledge is required. The Apple Create ML Module provides computer-based ML, requiring only a few lines of code.

The Veterans Health Administration (VHA) is the largest single health care system in the US, and nearly 50 000 cancer cases are diagnosed at the VHA annually.16 Cancers of the lung and colon are among the most common sources of invasive cancer and are the 2 most common causes of cancer deaths in America.16 We have previously reported using Apple ML in detecting non-small cell lung cancers (NSCLCs), including adenocarcinomas and squamous cell carcinomas (SCCs); and colon cancers with accuracy.17,18 In the present study, we expand on these findings by comparing Apple and Google ML platforms in a variety of common pathologic scenarios in veteran patients. Using limited training data, both programs are compared for precision and recall in differentiating conditions involving lung and colon pathology.

In the first 4 experiments, we evaluated the ability of the platforms to differentiate normal lung tissue from cancerous lung tissue, to distinguish lung adenocarcinoma from SCC, and to differentiate colon adenocarcinoma from normal colon tissue. Next, cases of colon adenocarcinoma were assessed to determine whether the presence or absence of a mutation in the KRAS proto-oncogene could be determined histologically using the AI platforms. Mutations in KRAS are found in a variety of cancers, including about 40% of colon adenocarcinomas.19 For colon cancers, the presence or absence of the mutation in KRAS has important implications for patients as it determines whether the tumor will respond to specific chemotherapy agents.20 The presence of a KRAS mutation is currently determined by complex molecular testing of tumor tissue.21 However, we assessed the potential of ML to determine whether the mutation is present by computerized morphologic analysis alone. Our last experiment examined the ability of the Apple and Google platforms to differentiate between adenocarcinomas of lung origin vs colon origin. This has potential utility in determining the site of origin of metastatic carcinoma.22

Methods

Fifty cases of lung SCC, 50 cases of lung adenocarcinoma, and 50 cases of colon adenocarcinoma were randomly retrieved from our molecular database. Twenty-five colon adenocarcinoma cases were positive for mutation in KRAS, while 25 cases were negative for mutation in KRAS. Seven hundred fifty total images of lung tissue (250 benign lung tissue, 250 lung adenocarcinomas, and 250 lung SCCs) and 500 total images of colon tissue (250 benign colon tissue and 250 colon adenocarcinoma) were obtained using a Leica Microscope MC190 HD Camera (Wetzlar, Germany) connected to an Olympus BX41 microscope (Center Valley, PA) and the Leica Acquire 9072 software for Apple computers. All the images were captured at a resolution of 1024 x 768 pixels using a 60x dry objective. Lung tissue images were captured and saved on a 2012 Apple MacBook Pro computer, and colon images were captured and saved on a 2011 Apple iMac computer. Both computers were running macOS v10.13.

Creating Image Classifier Models Using Apple Create ML

Apple Create ML is a suite of products that use various tools to create and train custom ML models on Apple computers.15 The suite contains many features, including image classification to train an ML model to classify images, natural language processing to classify natural language text, and tabular data to train models that deal with labeling information or estimating new quantities. We used Create ML Image Classification to create image classifier models for our project (Appendix A).

fed03610456_appendix_ab.png

Creating ML Modules Using Google Cloud AutoML Vision Beta

Google Cloud AutoML is a suite of machine learning products, including AutoML Vision, AutoML Natural Language, and AutoML Translation.14 All Cloud AutoML machine learning products were in beta version at the time of experimentation. We used Cloud AutoML Vision beta to create ML modules for our project. Unlike Apple Create ML, which is run on a local Apple computer, Google Cloud AutoML is run online using a Google Cloud account. There are no minimum specification requirements for the local computer because the platform uses a cloud-based architecture (Appendix B).

Experiment 1

We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to detect and subclassify NSCLC based on the histopathologic images. We created 3 classes of images (250 images each): benign lung tissue, lung adenocarcinoma, and lung SCC.

Experiment 2

We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between normal lung tissue and NSCLC histopathologic images with 50/50 mixture of lung adenocarcinoma and lung SCC. We created 2 classes of images (250 images each): benign lung tissue and lung NSCLC.

Experiment 3

We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between lung adenocarcinoma and lung SCC histopathologic images. We created 2 classes of images (250 images each): adenocarcinoma and SCC.

Experiment 4

We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to detect colon cancer histopathologic images regardless of KRAS mutation status. We created 2 classes of images (250 images each): benign colon tissue and colon adenocarcinoma.

Experiment 5

We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between histopathologic images of colon adenocarcinoma with mutations in KRAS and those of colon adenocarcinoma without the mutation in KRAS. We created 2 classes of images (125 images each): colon adenocarcinoma cases with mutation in KRAS and colon adenocarcinoma cases without the mutation in KRAS.

fed03610456_t.png

Experiment 6

We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between lung adenocarcinoma and colon adenocarcinoma histopathologic images. We created 2 classes of images (250 images each): colon adenocarcinoma and lung adenocarcinoma.

Results

Twelve machine learning models were created in 6 experiments using Apple Create ML and Google AutoML (Table). To investigate recall and precision differences between the Apple and Google ML algorithms, we performed 2-tailed paired t tests. No statistically significant differences were found (P = .52 for recall and P = .60 for precision).
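
To illustrate the comparison, the following minimal Python sketch computes the paired t statistic for two platforms evaluated on the same 6 experiments. The recall values are hypothetical placeholders, not the study's data (which appear in the Table), and the function name is illustrative. Only the standard library is used, so the sketch stops at the t statistic and degrees of freedom.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for a paired t test on two equal-length samples."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(a, b)]             # per-experiment differences
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean_diff / std_err, n - 1


# Hypothetical recall values for 6 experiments (placeholders, not the study's data).
apple_recall = [0.90, 0.95, 0.85, 0.99, 0.60, 0.97]
google_recall = [0.92, 0.93, 0.88, 0.98, 0.65, 0.96]

t_stat, dof = paired_t_statistic(apple_recall, google_recall)
print(f"t = {t_stat:.3f}, df = {dof}")
```

The 2-tailed P value is then read from the t distribution with the returned degrees of freedom, for example with a statistics package. With 5 degrees of freedom, an absolute t value below about 2.571 corresponds to P > .05.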

fed03610456_f1.png

fed03610456_f2.png

Overall, each model performed well in distinguishing between normal and neoplastic tissue for both lung and colon cancers. In subclassifying NSCLC into adenocarcinoma and SCC, the models achieved high precision and recall. The models were also successful in distinguishing between lung and colonic origin of adenocarcinoma (Figures 1-4). However, both systems had trouble distinguishing colon adenocarcinoma with mutations in KRAS from colon adenocarcinoma without mutations in KRAS.

Discussion

Image classifier models built with ML algorithms hold considerable promise for the health care field. ML products, such as the modules offered by Apple and Google, are easy to use and have a simple graphical user interface that allows individuals to train models to perform humanlike tasks in real time. In our experiments, we compared the 2 platforms on their ability to differentiate and subclassify histopathologic images with high precision and recall, using common scenarios in treating veteran patients.

fed03610456_f3.png

Analysis of the results revealed high precision and recall values, illustrating the models’ ability to distinguish benign lung tissue from lung SCC and lung adenocarcinoma in ML model 1, benign lung tissue from NSCLC in ML model 2, and benign colon tissue from colonic adenocarcinoma in ML model 4. In ML models 3 and 6, both ML algorithms performed at a high level in differentiating lung SCC from lung adenocarcinoma and lung adenocarcinoma from colonic adenocarcinoma, respectively. Of note, ML model 5 had the lowest precision and recall values across both algorithms, demonstrating the models’ limited utility in predicting molecular profiles, such as the mutations in KRAS tested here. This is not surprising, as pathologists currently require complex molecular tests to detect mutations in KRAS reliably in colon cancer.
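
Precision and recall, the 2 measures compared throughout, can be computed from a model's predictions on a testing set. The following is a minimal Python sketch under the standard definitions (precision is true positives divided by all predicted positives; recall is true positives divided by all actual positives). The function name and the example labels are illustrative assumptions, and the internal calculations used by either platform are not described here.

```python
def precision_recall(true_labels, predicted_labels, positive):
    """Return (precision, recall) for one class, treated as the positive class."""
    tp = fp = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if predicted == positive and truth == positive:
            tp += 1  # correctly flagged as the positive class
        elif predicted == positive:
            fp += 1  # flagged as positive but actually another class
        elif truth == positive:
            fn += 1  # a positive case the model missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical labels for 4 test images; not data from this study.
    truth = ["adenocarcinoma", "adenocarcinoma", "SCC", "SCC"]
    predicted = ["adenocarcinoma", "SCC", "SCC", "SCC"]
    print(precision_recall(truth, predicted, positive="adenocarcinoma"))
```

In this made-up example, every image labeled adenocarcinoma by the model is correct (high precision), but one of the 2 true adenocarcinoma images is missed (lower recall), which is why both values are reported together.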

fed03610456_f4.png

Both modules require minimal programming experience and are easy to use. Our comparison identified several characteristics that distinguish the 2 products.

Apple Create ML Image Classifier is available for use on local Mac computers running Xcode version 10 and macOS 10.14 or later, with just 3 lines of code required to perform computations. Although this product is limited to Apple computers, it is free to use, and images are stored on the computer hard drive. A distinctive feature of the Apple platform is that images can be augmented to alter their appearance and enhance model training. For example, imported images can be cropped, rotated, blurred, and flipped to improve the model’s ability to recognize patterns in test images. This feature is not as readily available on the Google platform. By default, Apple Create ML Image Classifier uses 75% of the imported images as the training set and a random 5% as the validation set. The remaining 20% of images comprise the testing set. Training a model takes about 2 minutes on average. The score threshold is set at 50% and, unlike in Google AutoML Vision, cannot be adjusted for each image class.
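
To illustrate what a score threshold does, the sketch below reports only those classes whose confidence score meets the threshold, using either a single fixed value of 50% or an optional per-class override. This is a hedged illustration in Python: the function name, the confidence scores, and the override mechanism are hypothetical, and the way either platform applies its threshold internally is not described in this article.

```python
def labels_above_threshold(scores, default_threshold=0.5, per_class=None):
    """Return the class labels whose score meets the applicable threshold.

    scores: dict mapping class label to a confidence score between 0 and 1.
    per_class: optional dict of class-specific thresholds (an assumption,
    standing in for a platform that allows per-class adjustment).
    """
    per_class = per_class or {}
    return sorted(
        label
        for label, score in scores.items()
        if score >= per_class.get(label, default_threshold)
    )


if __name__ == "__main__":
    # Hypothetical confidence scores for one test image.
    scores = {"lung adenocarcinoma": 0.62, "lung SCC": 0.38}
    print(labels_above_threshold(scores))
    print(labels_above_threshold(scores, per_class={"lung SCC": 0.30}))
```

Lowering the threshold for a class makes the model report that class more often, which tends to raise recall at the cost of precision; a fixed threshold removes that adjustment.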

Google AutoML Vision is not tied to a particular platform and can be accessed from many devices. It stores images on remote Google servers but requires computing fees after a $300 credit for 12 months. On AutoML Vision, a random 80% of the total images are used in the training set, 10% are used in the validation set, and 10% are used in the testing set. These default percentages differ from those used by Apple Create ML, which is worth keeping in mind when comparing the 2 modules. Training Google AutoML Vision with default computational power takes longer on average than training Apple Create ML, at about 8 minutes per machine learning module. However, it is possible to choose more computational power for an additional fee and decrease module training time. The user receives e-mail alerts when the computation begins and when it is completed. The computation time is calculated by subtracting the time of the initial e-mail from the time of the final e-mail.
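
The practical effect of the 2 default splits can be seen by applying them to a 500-image experiment (2 classes of 250 images each). The Python sketch below is an illustration only: the function name, the fixed random seed, and the rule that any rounding remainder goes to the training set are assumptions, since the text does not describe how either platform shuffles images or rounds subset sizes.

```python
import random

# Default percentages reported in the text for each platform.
APPLE_DEFAULT = {"train": 75, "validation": 5, "test": 20}
GOOGLE_DEFAULT = {"train": 80, "validation": 10, "test": 10}


def split_dataset(items, percentages, seed=0):
    """Randomly split items into named subsets by whole-number percentages."""
    if sum(percentages.values()) != 100:
        raise ValueError("percentages must sum to 100")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    sizes = {name: len(shuffled) * pct // 100 for name, pct in percentages.items()}
    # Assumption: images left over from integer rounding go to the training set.
    sizes["train"] += len(shuffled) - sum(sizes.values())
    subsets, start = {}, 0
    for name, size in sizes.items():
        subsets[name] = shuffled[start:start + size]
        start += size
    return subsets


if __name__ == "__main__":
    images = [f"image_{i:03d}.jpg" for i in range(500)]  # hypothetical file names
    for platform, pcts in (("Apple", APPLE_DEFAULT), ("Google", GOOGLE_DEFAULT)):
        subsets = split_dataset(images, pcts)
        print(platform, {name: len(group) for name, group in subsets.items()})
```

With 500 images, the Apple defaults give 375 training, 25 validation, and 100 testing images, whereas the Google defaults give 400, 50, and 50. The Apple testing set is therefore twice the size of the Google testing set, so reported precision and recall rest on different numbers of test images.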

Based on our calculations, we found no significant difference in the recall and precision values obtained by the 2 machine learning algorithms tested at their default settings. These findings demonstrate the promise of using an ML algorithm to assist in the performance of human tasks, specifically the diagnosis of histopathologic images. These results have numerous potential uses in clinical medicine. ML algorithms have been successfully applied to diagnostic and prognostic endeavors in pathology,23-28 dermatology,29-31 ophthalmology,32 cardiology,33 and radiology.34-36

Pathologists often use additional tests, such as special staining of tissues or molecular tests, to assist with accurate classification of tumors. ML platforms offer the potential of an additional tool for pathologists to use along with human microscopic interpretation.37,38 In addition, the number of pathologists in the US is dramatically decreasing, and many other countries have marked physician shortages, especially in fields of specialized training such as pathology.39-42 These models could readily assist physicians in underserved countries and impact shortages of pathologists elsewhere by providing more specific diagnoses in an expedited manner.43

Finally, although we have explored the application of these platforms in common cancer scenarios, great potential exists to use similar techniques in the detection of other conditions. These include the potential for classification and risk assessment of precancerous lesions, infectious processes in tissue (eg, detection of tuberculosis or malaria),24,44 inflammatory conditions (eg, arthritis subtypes, gout),45 blood disorders (eg, abnormal blood cell morphology),46 and many others. The potential of these technologies to improve health care delivery to veteran patients seems to be limited only by the imagination of the user.47

Regarding the limited effectiveness in determining the presence or absence of mutations in KRAS for colon adenocarcinoma, as noted above, pathologists currently rely on complex molecular tests to detect the mutations at the DNA level.21 It is possible that more extensive training data sets would improve recall and precision in cases such as these, and this warrants further study. Our experiments were limited by the terms of the free trial software agreements; no costs were incurred to use the algorithms, though an Apple computer was required.

Conclusion

We have demonstrated the successful application of 2 readily available ML platforms in providing diagnostic guidance in differentiation between common cancer conditions in veteran patient populations. Although both platforms performed very well with no statistically significant differences in results, some distinctions are worth noting. Apple Create ML can be used on local computers but is limited to an Apple operating system. Google AutoML is not platform-specific but runs only via Google Cloud with associated computational fees. Using these readily available models, we demonstrated the vast potential of AI in diagnostic pathology. The application of AI to clinical medicine remains in the very early stages. The VA is uniquely poised to provide leadership as AI technologies will continue to dramatically change the future of health care, both in veteran and nonveteran patients nationwide.

Acknowledgments

The authors thank Paul Borkowski for his constructive criticism and proofreading of this manuscript. This material is the result of work supported with resources and the use of facilities at the James A. Haley Veterans’ Hospital.

References

1. Moor J. The Dartmouth College artificial intelligence conference: the next fifty years. AI Mag. 2006;27(4):87-91.

2. Trump D. Accelerating America’s leadership in artificial intelligence. https://www.whitehouse.gov/articles/accelerating-americas-leadership-in-artificial-intelligence. Published February 11, 2019. Accessed September 4, 2019.

3. Samuel AL. Some studies in machine learning using the game of checkers. IBM J Res Dev. 1959;3(3):210-229.

4. SAS Users Group International. Neural networks and statistical models. In: Sarle WS. Proceedings of the Nineteenth Annual SAS Users Group International Conference. SAS Institute: Cary, North Carolina; 1994:1538-1550. http://www.sascommunity.org/sugi/SUGI94/Sugi-94-255%20Sarle.pdf. Accessed September 16, 2019.

5. Schmidhuber J. Deep learning in neural networks: an overview. Neural Networks. 2015;61:85-117.

6. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444.

7. Jiang F, Jiang Y, Li H, et al. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. 2017;2(4):230-243.

8. Erickson BJ, Korfiatis P, Akkus Z, Kline TL. Machine learning for medical imaging. Radiographics. 2017;37(2):505-515.

9. Deo RC. Machine learning in medicine. Circulation. 2015;132(20):1920-1930.

10. Janowczyk A, Madabhushi A. Deep learning for digital pathology image analysis: a comprehensive tutorial with selected use cases. J Pathol Inform. 2016;7(1):29.

11. Oquab M, Bottou L, Laptev I, Sivic J. Learning and transferring mid-level image representations using convolutional neural networks. Presented at: IEEE Conference on Computer Vision and Pattern Recognition, 2014. http://openaccess.thecvf.com/content_cvpr_2014/html/Oquab_Learning_and_Transferring_2014_CVPR_paper.html. Accessed September 4, 2019.

12. Shin HC, Roth HR, Gao M, et al. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging. 2016;35(5):1285-1298.

13. Tajbakhsh N, Shin JY, Gurudu SR, et al. Convolutional neural networks for medical image analysis: full training or fine tuning? IEEE Trans Med Imaging. 2016;35(5):1299-1312.

14. Cloud AutoML. https://cloud.google.com/automl. Accessed September 4, 2019.

15. Create ML. https://developer.apple.com/documentation/createml. Accessed September 4, 2019.

16. Zullig LL, Sims KJ, McNeil R, et al. Cancer incidence among patients of the U.S. Veterans Affairs Health Care System: 2010 Update. Mil Med. 2017;182(7):e1883-e1891.

17. Borkowski AA, Wilson CP, Borkowski SA, Deland LA, Mastorides SM. Using Apple machine learning algorithms to detect and subclassify non-small cell lung cancer. https://arxiv.org/ftp/arxiv/papers/1808/1808.08230.pdf. Accessed September 4, 2019.

18. Borkowski AA, Wilson CP, Borkowski SA, Thomas LB, Deland LA, Mastorides SM. Apple machine learning algorithms successfully detect colon cancer but fail to predict KRAS mutation status. http://arxiv.org/abs/1812.04660. Revised January 15, 2019. Accessed September 4, 2019.

19. Armaghany T, Wilson JD, Chu Q, Mills G. Genetic alterations in colorectal cancer. Gastrointest Cancer Res. 2012;5(1):19-27.

20. Herzig DO, Tsikitis VL. Molecular markers for colon diagnosis, prognosis and targeted therapy. J Surg Oncol. 2015;111(1):96-102.

21. Ma W, Brodie S, Agersborg S, Funari VA, Albitar M. Significant improvement in detecting BRAF, KRAS, and EGFR mutations using next-generation sequencing as compared with FDA-cleared kits. Mol Diagn Ther. 2017;21(5):571-579.

22. Greco FA. Molecular diagnosis of the tissue of origin in cancer of unknown primary site: useful in patient management. Curr Treat Options Oncol. 2013;14(4):634-642.

23. Bejnordi BE, Veta M, van Diest PJ, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA. 2017;318(22):2199-2210.

24. Xiong Y, Ba X, Hou A, Zhang K, Chen L, Li T. Automatic detection of mycobacterium tuberculosis using artificial intelligence. J Thorac Dis. 2018;10(3):1936-1940.

25. Cruz-Roa A, Gilmore H, Basavanhally A, et al. Accurate and reproducible invasive breast cancer detection in whole-slide images: a deep learning approach for quantifying tumor extent. Sci Rep. 2017;7:46450.

26. Coudray N, Ocampo PS, Sakellaropoulos T, et al. Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning. Nat Med. 2018;24(10):1559-1567.

27. Ertosun MG, Rubin DL. Automated grading of gliomas using deep learning in digital pathology images: a modular approach with ensemble of convolutional neural networks. AMIA Annu Symp Proc. 2015;2015:1899-1908.

28. Wahab N, Khan A, Lee YS. Two-phase deep convolutional neural network for reducing class skewness in histopathological images based breast cancer detection. Comput Biol Med. 2017;85:86-97.

29. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115-118.

30. Han SS, Park GH, Lim W, et al. Deep neural networks show an equivalent and often superior performance to dermatologists in onychomycosis diagnosis: automatic construction of onychomycosis datasets by region-based convolutional deep neural network. PLoS One. 2018;13(1):e0191493.

31. Fujisawa Y, Otomo Y, Ogata Y, et al. Deep-learning-based, computer-aided classifier developed with a small dataset of clinical images surpasses board-certified dermatologists in skin tumour diagnosis. Br J Dermatol. 2019;180(2):373-381.

32. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. 2016;316(22):2402-2410.

33. Weng SF, Reps J, Kai J, Garibaldi JM, Qureshi N. Can machine-learning improve cardiovascular risk prediction using routine clinical data? PLoS One. 2017;12(4):e0174944.

34. Cheng J-Z, Ni D, Chou Y-H, et al. Computer-aided diagnosis with deep learning architecture: applications to breast lesions in US images and pulmonary nodules in CT scans. Sci Rep. 2016;6(1):24454.

35. Wang X, Yang W, Weinreb J, et al. Searching for prostate cancer by fully automated magnetic resonance imaging classification: deep learning versus non-deep learning. Sci Rep. 2017;7(1):15415.

36. Lakhani P, Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology. 2017;284(2):574-582.

37. Bardou D, Zhang K, Ahmad SM. Classification of breast cancer based on histology images using convolutional neural networks. IEEE Access. 2018;6(6):24680-24693.

38. Sheikhzadeh F, Ward RK, van Niekerk D, Guillaud M. Automatic labeling of molecular biomarkers of immunohistochemistry images using fully convolutional networks. PLoS One. 2018;13(1):e0190783.

39. Metter DM, Colgan TJ, Leung ST, Timmons CF, Park JY. Trends in the US and Canadian pathologist workforces from 2007 to 2017. JAMA Netw Open. 2019;2(5):e194337.

40. Benediktsson H, Whitelaw J, Roy I. Pathology services in developing countries: a challenge. Arch Pathol Lab Med. 2007;131(11):1636-1639.

41. Graves D. The impact of the pathology workforce crisis on acute health care. Aust Health Rev. 2007;31(suppl 1):S28-S30.

42. NHS pathology shortages cause cancer diagnosis delays. https://www.gmjournal.co.uk/nhs-pathology-shortages-are-causing-cancer-diagnosis-delays. Published September 18, 2018. Accessed September 4, 2019.

43. Abbott LM, Smith SD. Smartphone apps for skin cancer diagnosis: Implications for patients and practitioners. Australas J Dermatol. 2018;59(3):168-170.

44. Poostchi M, Silamut K, Maude RJ, Jaeger S, Thoma G. Image analysis and machine learning for detecting malaria. Transl Res. 2018;194:36-55.

45. Orange DE, Agius P, DiCarlo EF, et al. Identification of three rheumatoid arthritis disease subtypes by machine learning integration of synovial histologic features and RNA sequencing data. Arthritis Rheumatol. 2018;70(5):690-701.

46. Rodellar J, Alférez S, Acevedo A, Molina A, Merino A. Image processing and machine learning in the morphological analysis of blood cells. Int J Lab Hematol. 2018;40(suppl 1):46-53.

47. Litjens G, Kooi T, Bejnordi BE, et al. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60-88.

Author and Disclosure Information

Andrew Borkowski is Chief of the Molecular Diagnostics Laboratory; Catherine Wilson is a Medical Technologist; Steven Borkowski is a Research Consultant; Brannon Thomas is Chief of the Microbiology Laboratory; Lauren Deland is a Research Coordinator; and Stephen Mastorides is Chief of the Pathology and Laboratory Medicine Service; all at James A. Haley Veterans’ Hospital in Tampa, Florida. Andrew Borkowski is a Professor; L. Brannon Thomas is an Assistant Professor; Stefanie Grewe is a Pathology Resident; and Stephen Mastorides is a Professor; all at the University of South Florida Morsani College of Medicine in Tampa.
Correspondence: Andrew Borkowski (andrew.borkowski@va.gov)

Author disclosures
The authors report no actual or potential conflicts of interest with regard to this article.

Disclaimer
The opinions expressed herein are those of the authors and do not necessarily reflect those of Federal Practitioner, Frontline Medical Communications Inc., the US Government, or any of its agencies.

Issue: Federal Practitioner - 36(10)a
Page Number: 456-463

Two machine learning platforms were successfully used to provide diagnostic guidance in the differentiation between common cancer conditions in veteran populations.


 

 

Conclusion

We have demonstrated the successful application of 2 readily available ML platforms in providing diagnostic guidance in differentiation between common cancer conditions in veteran patient populations. Although both platforms performed very well with no statistically significant differences in results, some distinctions are worth noting. Apple Create ML can be used on local computers but is limited to an Apple operating system. Google AutoML is not platform-specific but runs only via Google Cloud with associated computational fees. Using these readily available models, we demonstrated the vast potential of AI in diagnostic pathology. The application of AI to clinical medicine remains in the very early stages. The VA is uniquely poised to provide leadership as AI technologies will continue to dramatically change the future of health care, both in veteran and nonveteran patients nationwide.

Acknowledgments

The authors thank Paul Borkowski for his constructive criticism and proofreading of this manuscript. This material is the result of work supported with resources and the use of facilities at the James A. Haley Veterans’ Hospital.

Artificial intelligence (AI), first described in 1956, encompasses the field of computer science in which machines are trained to learn from experience. The term was popularized by the 1956 Dartmouth College Summer Research Project on Artificial Intelligence.1 The field of AI is rapidly growing and has the potential to affect many aspects of our lives. The emerging importance of AI is demonstrated by a February 2019 executive order that launched the American AI Initiative, allocating resources and funding for AI development.2 The executive order stresses the potential impact of AI in the health care field, including its potential utility to diagnose disease. Federal agencies were directed to invest in AI research and development to promote rapid breakthroughs in AI technology that may impact multiple areas of society.

Machine learning (ML), a subset of AI, was defined in 1959 by Arthur Samuel and is achieved by employing mathematical models to compute sample data sets.3 Originating from statistical linear models, neural networks were conceived to accomplish these tasks.4 These pioneering scientific achievements led to recent developments of deep neural networks. These models are developed to recognize patterns and achieve complex computational tasks within a matter of minutes, often far exceeding human ability.5 Compared with human decision making, ML can increase efficiency by decreasing computation time while achieving high precision and recall.6

ML has the potential for numerous applications in the health care field.7-9 One promising application is in the field of anatomic pathology. ML allows representative images to be used to train a computer to recognize patterns from labeled photographs. Based on a set of images selected to represent a specific tissue or disease process, the computer can be trained to evaluate and recognize new and unique images from patients and render a diagnosis.10 Prior to modern ML models, users would have to import many thousands of training images to produce algorithms that could recognize patterns with high accuracy. Modern ML algorithms allow for a technique known as transfer learning, in which a model pretrained on a large image set is adapted to a new task, such that far fewer images are required for training.11-13

Two novel ML platforms available for public use are offered through Google (Mountain View, CA) and Apple (Cupertino, CA).14,15 They each offer a user-friendly interface with minimal experience required in computer science. Google AutoML uses ML via cloud services to store and retrieve data with ease. No coding knowledge is required. The Apple Create ML Module provides computer-based ML, requiring only a few lines of code.

The Veterans Health Administration (VHA) is the largest single health care system in the US, and nearly 50 000 cancer cases are diagnosed at the VHA annually.16 Cancers of the lung and colon are among the most common sources of invasive cancer and are the 2 most common causes of cancer deaths in America.16 We have previously reported using Apple ML in detecting non-small cell lung cancers (NSCLCs), including adenocarcinomas and squamous cell carcinomas (SCCs); and colon cancers with accuracy.17,18 In the present study, we expand on these findings by comparing Apple and Google ML platforms in a variety of common pathologic scenarios in veteran patients. Using limited training data, both programs are compared for precision and recall in differentiating conditions involving lung and colon pathology.

In the first 4 experiments, we evaluated the ability of the platforms to differentiate normal lung tissue from cancerous lung tissue, to distinguish lung adenocarcinoma from SCC, and to differentiate colon adenocarcinoma from normal colon tissue. Next, cases of colon adenocarcinoma were assessed to determine whether the presence or absence of mutation in the KRAS proto-oncogene could be determined histologically using the AI platforms. Mutations in KRAS are found in a variety of cancers, including about 40% of colon adenocarcinomas.19 For colon cancers, the presence or absence of the mutation in KRAS has important implications for patients as it determines whether the tumor will respond to specific chemotherapy agents.20 The presence of the mutation is currently determined by complex molecular testing of tumor tissue.21 Here, we assessed the potential of ML to determine whether the mutation is present by computerized morphologic analysis alone. Our last experiment examined the ability of the Apple and Google platforms to differentiate between adenocarcinomas of lung origin vs colon origin. This has potential utility in determining the site of origin of metastatic carcinoma.22

Methods

Fifty cases of lung SCC, 50 cases of lung adenocarcinoma, and 50 cases of colon adenocarcinoma were randomly retrieved from our molecular database. Twenty-five colon adenocarcinoma cases were positive for mutation in KRAS, while 25 cases were negative for mutation in KRAS. Seven hundred fifty total images of lung tissue (250 benign lung tissue, 250 lung adenocarcinomas, and 250 lung SCCs) and 500 total images of colon tissue (250 benign colon tissue and 250 colon adenocarcinoma) were obtained using a Leica Microscope MC190 HD Camera (Wetzlar, Germany) connected to an Olympus BX41 microscope (Center Valley, PA) and the Leica Acquire 9072 software for Apple computers. All the images were captured at a resolution of 1024 x 768 pixels using a 60x dry objective. Lung tissue images were captured and saved on a 2012 Apple MacBook Pro computer, and colon images were captured and saved on a 2011 Apple iMac computer. Both computers were running macOS v10.13.

Creating Image Classifier Models Using Apple Create ML

Apple Create ML is a suite of products that use various tools to create and train custom ML models on Apple computers.15 The suite contains many features, including image classification to train a ML model to classify images, natural language processing to classify natural language text, and tabular data to train models that deal with labeling information or estimating new quantities. We used Create ML Image Classification to create image classifier models for our project (Appendix A).

fed03610456_appendix_ab.png

Creating ML Modules Using Google Cloud AutoML Vision Beta

Google Cloud AutoML is a suite of machine learning products, including AutoML Vision, AutoML Natural Language, and AutoML Translation.14 All Cloud AutoML machine learning products were in beta version at the time of experimentation. We used Cloud AutoML Vision beta to create ML modules for our project. Unlike Apple Create ML, which is run on a local Apple computer, Google Cloud AutoML is run online using a Google Cloud account. There are no minimum specification requirements for the local computer because it uses the cloud-based architecture (Appendix B).

Experiment 1

We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to detect and subclassify NSCLC based on the histopathologic images. We created 3 classes of images (250 images each): benign lung tissue, lung adenocarcinoma, and lung SCC.

Experiment 2

We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between histopathologic images of normal lung tissue and NSCLC, with the NSCLC class comprising a 50/50 mixture of lung adenocarcinoma and lung SCC. We created 2 classes of images (250 images each): benign lung tissue and lung NSCLC.

Experiment 3

We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between lung adenocarcinoma and lung SCC histopathologic images. We created 2 classes of images (250 images each): adenocarcinoma and SCC.

Experiment 4

We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to detect colon cancer in histopathologic images regardless of KRAS mutation status. We created 2 classes of images (250 images each): benign colon tissue and colon adenocarcinoma.

Experiment 5

We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between histopathologic images of colon adenocarcinoma with and without mutations in KRAS. We created 2 classes of images (125 images each): colon adenocarcinoma cases with mutation in KRAS and colon adenocarcinoma cases without mutation in KRAS.

fed03610456_t.png

Experiment 6

We compared Apple Create ML Image Classifier and Google AutoML Vision in their ability to differentiate between lung adenocarcinoma and colon adenocarcinoma histopathologic images. We created 2 classes of images (250 images each): colon adenocarcinoma and lung adenocarcinoma.

Results

Twelve machine learning models were created in 6 experiments using Apple Create ML and Google AutoML (Table). To investigate recall and precision differences between the Apple and the Google ML algorithms, we performed 2-tailed, paired t tests. No statistically significant differences were found (P = .52 for recall and .60 for precision).
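
To make the statistical comparison concrete, the following is a minimal Python sketch of a 2-tailed, paired t test applied to per-experiment values from 2 platforms. It is not the authors' software: the function names (paired_t_test, t_pdf, two_tailed_p) and the recall values are hypothetical placeholders, and the P value is approximated by numerically integrating the Student t density rather than by a statistics package.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=10_000):
    """Two-tailed P value: 1 minus the density mass between -|t| and |t|.

    The mass on [0, |t|] is approximated with Simpson's rule (steps must be even).
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3
    return max(0.0, 1.0 - 2.0 * half_area)


def paired_t_test(a, b):
    """Return (t statistic, degrees of freedom, two-tailed P) for paired samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1, two_tailed_p(t, n - 1)


# Hypothetical recall values for 6 experiments (NOT the study's results).
apple_recall = [0.90, 0.80, 0.95, 0.85, 0.70, 0.92]
google_recall = [0.88, 0.83, 0.93, 0.86, 0.72, 0.95]
t_stat, dof, p_value = paired_t_test(apple_recall, google_recall)
print(f"t = {t_stat:.3f}, df = {dof}, P = {p_value:.3f}")
```

With 6 experiments, the test has 5 degrees of freedom; a P value above the conventional .05 cutoff would indicate no statistically significant difference between the paired values.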

fed03610456_f1.png

fed03610456_f2.png

Overall, each model performed well in distinguishing between normal and neoplastic tissue for both lung and colon cancers. In subclassifying NSCLC into adenocarcinoma and SCC, the models were shown to have high levels of precision and recall. The models also were successful in distinguishing between lung and colonic origin of adenocarcinoma (Figures 1-4). However, both systems had trouble discerning colon adenocarcinoma with mutations in KRAS from adenocarcinoma without mutations in KRAS.
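
For readers less familiar with the 2 metrics, the short Python sketch below shows how precision and recall are conventionally computed for one class from paired lists of true and predicted labels. The function name precision_recall and the labels are hypothetical and purely illustrative; they are not the study's data or the platforms' internal calculations.

```python
def precision_recall(y_true, y_pred, positive):
    """Compute precision and recall for the class named by `positive`.

    precision = TP / (TP + FP): of the images predicted positive, how many were.
    recall    = TP / (TP + FN): of the truly positive images, how many were found.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels for 6 test images (NOT the study's data).
true_labels = ["cancer", "cancer", "cancer", "benign", "benign", "benign"]
pred_labels = ["cancer", "cancer", "benign", "cancer", "benign", "benign"]
precision, recall = precision_recall(true_labels, pred_labels, positive="cancer")
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```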

 

Discussion

Image classifier models using ML algorithms hold a promising future to revolutionize the health care field. ML products, such as those modules offered by Apple and Google, are easy to use and have a simple graphic user interface to allow individuals to train models to perform humanlike tasks in real time. In our experiments, we compared multiple algorithms to determine their ability to differentiate and subclassify histopathologic images with high precision and recall using common scenarios in treating veteran patients.

fed03610456_f3.png

Analysis of the results revealed high precision and recall values, illustrating the models’ ability to differentiate benign lung tissue from lung SCC and lung adenocarcinoma in ML model 1, benign lung from NSCLC in ML model 2, and benign colon from colonic adenocarcinoma in ML model 4. In ML models 3 and 6, both ML algorithms performed at a high level in differentiating lung SCC from lung adenocarcinoma and lung adenocarcinoma from colonic adenocarcinoma, respectively. Of note, ML model 5 had the lowest precision and recall values across both algorithms, demonstrating the models’ limited utility in predicting molecular profiles, such as mutations in KRAS as tested here. This is not surprising, as pathologists currently require complex molecular tests to detect mutations in KRAS reliably in colon cancer.

fed03610456_f4.png

Both modules require minimal programming experience and are easy to use. In our comparison, we demonstrated critical distinguishing characteristics that differentiate the 2 products.

Apple Create ML Image Classifier is available for use on local Mac computers that use Xcode version 10 and macOS 10.14 or later, with just 3 lines of code required to perform computations. Although this product is limited to Apple computers, it is free to use, and images are stored on the computer hard drive. A distinctive feature of the Apple platform is that images can be augmented to alter their appearance and enhance model training. For example, imported images can be cropped, rotated, blurred, and flipped to improve the model’s ability to recognize test images. This feature is not as readily available on the Google platform. By default, Apple Create ML Image Classifier uses 75% of the total imported images as the training set and a random 5% as the validation set. The remaining 20% of images comprise the testing set. Training the model takes about 2 minutes on average. The score threshold is set at 50% and, unlike in Google AutoML Vision, cannot be adjusted for each image class.
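
The idea behind image augmentation can be illustrated with a small Python sketch. It treats an image as a nested list of pixel values and derives flipped, rotated, and cropped variants from it. This is only a conceptual illustration under that simplifying assumption; the helper names (flip_horizontal, rotate_90_clockwise, crop, augment) are hypothetical, and the code does not reproduce how Create ML performs augmentation internally.

```python
def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


def crop(image, top, left, height, width):
    """Keep a height x width window whose upper-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def augment(image):
    """Return the original image plus several altered copies for training."""
    rows, cols = len(image), len(image[0])
    return [
        image,
        flip_horizontal(image),
        rotate_90_clockwise(image),
        crop(image, 0, 0, max(1, rows - 1), max(1, cols - 1)),
    ]


# A tiny 2 x 3 "image" of hypothetical grayscale values.
tiny_image = [[1, 2, 3],
              [4, 5, 6]]
for variant in augment(tiny_image):
    print(variant)
```

Each variant carries the same class label as the original image, so a single labeled photograph can contribute several distinct training examples.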

Google AutoML Vision is not tied to a specific operating system and can be accessed from many devices. It stores images on remote Google servers but requires computing fees after a $300 credit for 12 months. On AutoML Vision, a random 80% of the total images are used in the training set, 10% are used in the validation set, and 10% are used in the testing set. These default percentages differ from those of Apple Create ML, which should be kept in mind when comparing the modules. The time to train Google AutoML Vision with default computational power is longer on average than with Apple Create ML, at about 8 minutes. However, it is possible to choose more computational power for an additional fee and decrease training time. The user receives e-mail alerts when computation begins and when it is completed, and the computation time is calculated by subtracting the time of the initial e-mail from that of the final e-mail.
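
The effect of the 2 default splits on image counts can be sketched in Python as follows. The function split_images, the file names, and the fixed random seed are hypothetical choices made for illustration; neither platform exposes its splitting routine in this form.

```python
import random


def split_images(items, train_frac, val_frac, seed=0):
    """Randomly split items into training, validation, and testing sets.

    The testing set receives whatever remains after the first 2 sets are filled.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical file names for a 500-image experiment (2 classes of 250 images).
images = [f"image_{i:03d}.png" for i in range(500)]

apple_split = split_images(images, train_frac=0.75, val_frac=0.05)   # 75/5/20
google_split = split_images(images, train_frac=0.80, val_frac=0.10)  # 80/10/10

print([len(part) for part in apple_split])
print([len(part) for part in google_split])
```

For a 500-image experiment, the Apple defaults hold out 100 images for testing, whereas the Google defaults hold out 50, so the reported precision and recall values rest on test sets of different sizes.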

Based on our calculations, there was no significant difference in recall or precision between the 2 machine learning algorithms tested at the default settings. These findings demonstrate the promise of using a ML algorithm to assist in the performance of human tasks and behaviors, specifically the diagnosis of histopathologic images. These results have numerous potential uses in clinical medicine. ML algorithms have been successfully applied to diagnostic and prognostic endeavors in pathology,23-28 dermatology,29-31 ophthalmology,32 cardiology,33 and radiology.34-36

Pathologists often use additional tests, such as special staining of tissues or molecular tests, to assist with accurate classification of tumors. ML platforms offer the potential of an additional tool for pathologists to use along with human microscopic interpretation.37,38 In addition, the number of pathologists in the US is dramatically decreasing, and many other countries have marked physician shortages, especially in fields of specialized training such as pathology.39-42 These models could readily assist physicians in underserved countries and impact shortages of pathologists elsewhere by providing more specific diagnoses in an expedited manner.43

Finally, although we have explored the application of these platforms in common cancer scenarios, great potential exists to use similar techniques in the detection of other conditions. These include the potential for classification and risk assessment of precancerous lesions, infectious processes in tissue (eg, detection of tuberculosis or malaria),24,44 inflammatory conditions (eg, arthritis subtypes, gout),45 blood disorders (eg, abnormal blood cell morphology),46 and many others. The potential of these technologies to improve health care delivery to veteran patients seems to be limited only by the imagination of the user.47

Regarding the limited effectiveness in determining the presence or absence of mutations in KRAS for colon adenocarcinoma, pathologists currently rely on complex molecular tests to detect the mutations at the DNA level.21 The use of more extensive training data sets may improve recall and precision in cases such as these, which warrants further study. Our experiments were limited by the stipulations of the free trial software agreements; no costs were expended to use the algorithms, though an Apple computer was required.

Conclusion

We have demonstrated the successful application of 2 readily available ML platforms in providing diagnostic guidance in differentiation between common cancer conditions in veteran patient populations. Although both platforms performed very well with no statistically significant differences in results, some distinctions are worth noting. Apple Create ML can be used on local computers but is limited to an Apple operating system. Google AutoML is not platform-specific but runs only via Google Cloud with associated computational fees. Using these readily available models, we demonstrated the vast potential of AI in diagnostic pathology. The application of AI to clinical medicine remains in the very early stages. The VA is uniquely poised to provide leadership as AI technologies will continue to dramatically change the future of health care, both in veteran and nonveteran patients nationwide.

Acknowledgments

The authors thank Paul Borkowski for his constructive criticism and proofreading of this manuscript. This material is the result of work supported with resources and the use of facilities at the James A. Haley Veterans’ Hospital.

References

1. Moor J. The Dartmouth College artificial intelligence conference: the next fifty years. AI Mag. 2006;27(4):87-91.

2. Trump D. Accelerating America’s leadership in artificial intelligence. https://www.whitehouse.gov/articles/accelerating-americas-leadership-in-artificial-intelligence. Published February 11, 2019. Accessed September 4, 2019.

3. Samuel AL. Some studies in machine learning using the game of checkers. IBM J Res Dev. 1959;3(3):210-229.

4. SAS Users Group International. Neural networks and statistical models. In: Sarle WS. Proceedings of the Nineteenth Annual SAS Users Group International Conference. SAS Institute: Cary, North Carolina; 1994:1538-1550. http://www.sascommunity.org/sugi/SUGI94/Sugi-94-255%20Sarle.pdf. Accessed September 16, 2019.

5. Schmidhuber J. Deep learning in neural networks: an overview. Neural Networks. 2015;61:85-117.

6. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436-444.

7. Jiang F, Jiang Y, Li H, et al. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. 2017;2(4):230-243.

8. Erickson BJ, Korfiatis P, Akkus Z, Kline TL. Machine learning for medical imaging. Radiographics. 2017;37(2):505-515.

9. Deo RC. Machine learning in medicine. Circulation. 2015;132(20):1920-1930.

10. Janowczyk A, Madabhushi A. Deep learning for digital pathology image analysis: a comprehensive tutorial with selected use cases. J Pathol Inform. 2016;7(1):29.

11. Oquab M, Bottou L, Laptev I, Sivic J. Learning and transferring mid-level image representations using convolutional neural networks. Presented at: IEEE Conference on Computer Vision and Pattern Recognition, 2014. http://openaccess.thecvf.com/content_cvpr_2014/html/Oquab_Learning_and_Transferring_2014_CVPR_paper.html. Accessed September 4, 2019.

12. Shin HC, Roth HR, Gao M, et al. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Trans Med Imaging. 2016;35(5):1285-1298.

13. Tajbakhsh N, Shin JY, Gurudu SR, et al. Convolutional neural networks for medical image analysis: full training or fine tuning? IEEE Trans Med Imaging. 2016;35(5):1299-1312.

14. Cloud AutoML. https://cloud.google.com/automl. Accessed September 4, 2019.

15. Create ML. https://developer.apple.com/documentation/createml. Accessed September 4, 2019.

16. Zullig LL, Sims KJ, McNeil R, et al. Cancer incidence among patients of the U.S. Veterans Affairs Health Care System: 2010 Update. Mil Med. 2017;182(7):e1883-e1891.

17. Borkowski AA, Wilson CP, Borkowski SA, Deland LA, Mastorides SM. Using Apple machine learning algorithms to detect and subclassify non-small cell lung cancer. https://arxiv.org/ftp/arxiv/papers/1808/1808.08230.pdf. Accessed September 4, 2019.

18. Borkowski AA, Wilson CP, Borkowski SA, Thomas LB, Deland LA, Mastorides SM. Apple machine learning algorithms successfully detect colon cancer but fail to predict KRAS mutation status. http://arxiv.org/abs/1812.04660. Revised January 15, 2019. Accessed September 4, 2019.

19. Armaghany T, Wilson JD, Chu Q, Mills G. Genetic alterations in colorectal cancer. Gastrointest Cancer Res. 2012;5(1):19-27.

20. Herzig DO, Tsikitis VL. Molecular markers for colon diagnosis, prognosis and targeted therapy. J Surg Oncol. 2015;111(1):96-102.

21. Ma W, Brodie S, Agersborg S, Funari VA, Albitar M. Significant improvement in detecting BRAF, KRAS, and EGFR mutations using next-generation sequencing as compared with FDA-cleared kits. Mol Diagn Ther. 2017;21(5):571-579.

22. Greco FA. Molecular diagnosis of the tissue of origin in cancer of unknown primary site: useful in patient management. Curr Treat Options Oncol. 2013;14(4):634-642.

23. Bejnordi BE, Veta M, van Diest PJ, et al. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA. 2017;318(22):2199-2210.

24. Xiong Y, Ba X, Hou A, Zhang K, Chen L, Li T. Automatic detection of mycobacterium tuberculosis using artificial intelligence. J Thorac Dis. 2018;10(3):1936-1940.

25. Cruz-Roa A, Gilmore H, Basavanhally A, et al. Accurate and reproducible invasive breast cancer detection in whole-slide images: a deep learning approach for quantifying tumor extent. Sci Rep. 2017;7:46450.

26. Coudray N, Ocampo PS, Sakellaropoulos T, et al. Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning. Nat Med. 2018;24(10):1559-1567.

27. Ertosun MG, Rubin DL. Automated grading of gliomas using deep learning in digital pathology images: a modular approach with ensemble of convolutional neural networks. AMIA Annu Symp Proc. 2015;2015:1899-1908.

28. Wahab N, Khan A, Lee YS. Two-phase deep convolutional neural network for reducing class skewness in histopathological images based breast cancer detection. Comput Biol Med. 2017;85:86-97.

29. Esteva A, Kuprel B, Novoa RA, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115-118.

30. Han SS, Park GH, Lim W, et al. Deep neural networks show an equivalent and often superior performance to dermatologists in onychomycosis diagnosis: automatic construction of onychomycosis datasets by region-based convolutional deep neural network. PLoS One. 2018;13(1):e0191493.

31. Fujisawa Y, Otomo Y, Ogata Y, et al. Deep-learning-based, computer-aided classifier developed with a small dataset of clinical images surpasses board-certified dermatologists in skin tumour diagnosis. Br J Dermatol. 2019;180(2):373-381.

32. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. 2016;316(22):2402-2410.

33. Weng SF, Reps J, Kai J, Garibaldi JM, Qureshi N. Can machine-learning improve cardiovascular risk prediction using routine clinical data? PLoS One. 2017;12(4):e0174944.

34. Cheng J-Z, Ni D, Chou Y-H, et al. Computer-aided diagnosis with deep learning architecture: applications to breast lesions in US images and pulmonary nodules in CT scans. Sci Rep. 2016;6(1):24454.

35. Wang X, Yang W, Weinreb J, et al. Searching for prostate cancer by fully automated magnetic resonance imaging classification: deep learning versus non-deep learning. Sci Rep. 2017;7(1):15415.

36. Lakhani P, Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology. 2017;284(2):574-582.

37. Bardou D, Zhang K, Ahmad SM. Classification of breast cancer based on histology images using convolutional neural networks. IEEE Access. 2018;6(6):24680-24693.

38. Sheikhzadeh F, Ward RK, van Niekerk D, Guillaud M. Automatic labeling of molecular biomarkers of immunohistochemistry images using fully convolutional networks. PLoS One. 2018;13(1):e0190783.

39. Metter DM, Colgan TJ, Leung ST, Timmons CF, Park JY. Trends in the US and Canadian pathologist workforces from 2007 to 2017. JAMA Netw Open. 2019;2(5):e194337.

40. Benediktsson H, Whitelaw J, Roy I. Pathology services in developing countries: a challenge. Arch Pathol Lab Med. 2007;131(11):1636-1639.

41. Graves D. The impact of the pathology workforce crisis on acute health care. Aust Health Rev. 2007;31(suppl 1):S28-S30.

42. NHS pathology shortages cause cancer diagnosis delays. https://www.gmjournal.co.uk/nhs-pathology-shortages-are-causing-cancer-diagnosis-delays. Published September 18, 2018. Accessed September 4, 2019.

43. Abbott LM, Smith SD. Smartphone apps for skin cancer diagnosis: Implications for patients and practitioners. Australas J Dermatol. 2018;59(3):168-170.

44. Poostchi M, Silamut K, Maude RJ, Jaeger S, Thoma G. Image analysis and machine learning for detecting malaria. Transl Res. 2018;194:36-55.

45. Orange DE, Agius P, DiCarlo EF, et al. Identification of three rheumatoid arthritis disease subtypes by machine learning integration of synovial histologic features and RNA sequencing data. Arthritis Rheumatol. 2018;70(5):690-701.

46. Rodellar J, Alférez S, Acevedo A, Molina A, Merino A. Image processing and machine learning in the morphological analysis of blood cells. Int J Lab Hematol. 2018;40(suppl 1):46-53.

47. Litjens G, Kooi T, Bejnordi BE, et al. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60-88.


31. Fujisawa Y, Otomo Y, Ogata Y, et al. Deep-learning-based, computer-aided classifier developed with a small dataset of clinical images surpasses board-certified dermatologists in skin tumour diagnosis. Br J Dermatol. 2019;180(2):373-381.

32. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. 2016;316(22):2402-2410.

33. Weng SF, Reps J, Kai J, Garibaldi JM, Qureshi N. Can machine-learning improve cardiovascular risk prediction using routine clinical data? PLoS One. 2017;12(4):e0174944.

34. Cheng JZ, Ni D, Chou YH, et al. Computer-aided diagnosis with deep learning architecture: applications to breast lesions in US images and pulmonary nodules in CT scans. Sci Rep. 2016;6(1):24454.

35. Wang X, Yang W, Weinreb J, et al. Searching for prostate cancer by fully automated magnetic resonance imaging classification: deep learning versus non-deep learning. Sci Rep. 2017;7(1):15415.

36. Lakhani P, Sundaram B. Deep learning at chest radiography: automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology. 2017;284(2):574-582.

37. Bardou D, Zhang K, Ahmad SM. Classification of breast cancer based on histology images using convolutional neural networks. IEEE Access. 2018;6(6):24680-24693.

38. Sheikhzadeh F, Ward RK, van Niekerk D, Guillaud M. Automatic labeling of molecular biomarkers of immunohistochemistry images using fully convolutional networks. PLoS One. 2018;13(1):e0190783.

39. Metter DM, Colgan TJ, Leung ST, Timmons CF, Park JY. Trends in the US and Canadian pathologist workforces from 2007 to 2017. JAMA Netw Open. 2019;2(5):e194337.

40. Benediktsson H, Whitelaw J, Roy I. Pathology services in developing countries: a challenge. Arch Pathol Lab Med. 2007;131(11):1636-1639.

41. Graves D. The impact of the pathology workforce crisis on acute health care. Aust Health Rev. 2007;31(suppl 1):S28-S30.

42. NHS pathology shortages cause cancer diagnosis delays. https://www.gmjournal.co.uk/nhs-pathology-shortages-are-causing-cancer-diagnosis-delays. Published September 18, 2018. Accessed September 4, 2019.

43. Abbott LM, Smith SD. Smartphone apps for skin cancer diagnosis: Implications for patients and practitioners. Australas J Dermatol. 2018;59(3):168-170.

44. Poostchi M, Silamut K, Maude RJ, Jaeger S, Thoma G. Image analysis and machine learning for detecting malaria. Transl Res. 2018;194:36-55.

45. Orange DE, Agius P, DiCarlo EF, et al. Identification of three rheumatoid arthritis disease subtypes by machine learning integration of synovial histologic features and RNA sequencing data. Arthritis Rheumatol. 2018;70(5):690-701.

46. Rodellar J, Alférez S, Acevedo A, Molina A, Merino A. Image processing and machine learning in the morphological analysis of blood cells. Int J Lab Hematol. 2018;40(suppl 1):46-53.

47. Litjens G, Kooi T, Bejnordi BE, et al. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60-88.

Issue
Federal Practitioner - 36(10)a
Page Number
456-463