“Shortcuts” Causing Bias in Radiology Artificial Intelligence: Causes, Evaluation, and Mitigation

Imon Banerjee; Kamanasish Bhattacharjee; John L. Burns; Hari Trivedi; Saptarshi Purkayastha; Laleh Seyyed-Kalantari; Bhavik N. Patel; Rakesh Shiradkar; Judy Gichoya

doi:10.1016/j.jacr.2023.06.025

“Shortcuts” Causing Bias in Radiology Artificial Intelligence: Causes, Evaluation, and Mitigation

Imon Banerjee, Kamanasish Bhattacharjee, John L. Burns, Hari Trivedi, Saptarshi Purkayastha, Laleh Seyyed-Kalantari, Bhavik N. Patel, Rakesh Shiradkar, Judy Gichoya

Diagnostic Radiology

Research output: Contribution to journal › Review article › peer-review

Abstract

Despite the expert-level performance of artificial intelligence (AI) models for various medical imaging tasks, real-world performance failures with disparate outputs for various subgroups limit the usefulness of AI in improving patients’ lives. Many definitions of fairness have been proposed, with discussions of various tensions that arise in the choice of an appropriate metric to use to evaluate bias; for example, should one aim for individual or group fairness? One central observation is that AI models apply “shortcut learning” whereby spurious features (such as chest tubes and portable radiographic markers on intensive care unit chest radiography) on medical images are used for prediction instead of identifying true pathology. Moreover, AI has been shown to have a remarkable ability to detect protected attributes of age, sex, and race, while the same models demonstrate bias against historically underserved subgroups of age, sex, and race in disease diagnosis. Therefore, an AI model may take shortcut predictions from these correlations and subsequently generate an outcome that is biased toward certain subgroups even when protected attributes are not explicitly used as inputs into the model. As a result, these subgroups became nonprivileged subgroups. In this review, the authors discuss the various types of bias from shortcut learning that may occur at different phases of AI model development, including data bias, modeling bias, and inference bias. The authors thereafter summarize various tool kits that can be used to evaluate and mitigate bias and note that these have largely been applied to nonmedical domains and require more evaluation for medical AI. The authors then summarize current techniques for mitigating bias from preprocessing (data-centric solutions) and during model development (computational solutions) and postprocessing (recalibration of learning). Ongoing legal changes where the use of a biased model will be penalized highlight the necessity of understanding, detecting, and mitigating biases from shortcut learning and will require diverse research teams looking at the whole AI pipeline.

Original language	English (US)
Pages (from-to)	842-851
Number of pages	10
Journal	Journal of the American College of Radiology
Volume	20
Issue number	9
DOIs	https://doi.org/10.1016/j.jacr.2023.06.025
State	Published - Sep 2023

Keywords

Artificial Intelligence
machine learning

ASJC Scopus subject areas

Radiology Nuclear Medicine and imaging

Access to Document

10.1016/j.jacr.2023.06.025

Cite this

@article{5047395765eb450caa1eb64c2cc3b150,

title = "“Shortcuts” Causing Bias in Radiology Artificial Intelligence: Causes, Evaluation, and Mitigation",

abstract = "Despite the expert-level performance of artificial intelligence (AI) models for various medical imaging tasks, real-world performance failures with disparate outputs for various subgroups limit the usefulness of AI in improving patients{\textquoteright} lives. Many definitions of fairness have been proposed, with discussions of various tensions that arise in the choice of an appropriate metric to use to evaluate bias; for example, should one aim for individual or group fairness? One central observation is that AI models apply “shortcut learning” whereby spurious features (such as chest tubes and portable radiographic markers on intensive care unit chest radiography) on medical images are used for prediction instead of identifying true pathology. Moreover, AI has been shown to have a remarkable ability to detect protected attributes of age, sex, and race, while the same models demonstrate bias against historically underserved subgroups of age, sex, and race in disease diagnosis. Therefore, an AI model may take shortcut predictions from these correlations and subsequently generate an outcome that is biased toward certain subgroups even when protected attributes are not explicitly used as inputs into the model. As a result, these subgroups became nonprivileged subgroups. In this review, the authors discuss the various types of bias from shortcut learning that may occur at different phases of AI model development, including data bias, modeling bias, and inference bias. The authors thereafter summarize various tool kits that can be used to evaluate and mitigate bias and note that these have largely been applied to nonmedical domains and require more evaluation for medical AI. The authors then summarize current techniques for mitigating bias from preprocessing (data-centric solutions) and during model development (computational solutions) and postprocessing (recalibration of learning). Ongoing legal changes where the use of a biased model will be penalized highlight the necessity of understanding, detecting, and mitigating biases from shortcut learning and will require diverse research teams looking at the whole AI pipeline.",

keywords = "Artificial Intelligence, machine learning",

author = "Imon Banerjee and Kamanasish Bhattacharjee and Burns, {John L.} and Hari Trivedi and Saptarshi Purkayastha and Laleh Seyyed-Kalantari and Patel, {Bhavik N.} and Rakesh Shiradkar and Judy Gichoya",

note = "Publisher Copyright: {\textcopyright} 2023",

year = "2023",

month = sep,

doi = "10.1016/j.jacr.2023.06.025",

language = "English (US)",

volume = "20",

pages = "842--851",

journal = "Journal of the American College of Radiology",

issn = "1546-1440",

publisher = "Elsevier BV",

number = "9",

}

TY - JOUR

T1 - “Shortcuts” Causing Bias in Radiology Artificial Intelligence

T2 - Causes, Evaluation, and Mitigation

AU - Banerjee, Imon

AU - Bhattacharjee, Kamanasish

AU - Burns, John L.

AU - Trivedi, Hari

AU - Purkayastha, Saptarshi

AU - Seyyed-Kalantari, Laleh

AU - Patel, Bhavik N.

AU - Shiradkar, Rakesh

AU - Gichoya, Judy

PY - 2023/9

Y1 - 2023/9

N2 - Despite the expert-level performance of artificial intelligence (AI) models for various medical imaging tasks, real-world performance failures with disparate outputs for various subgroups limit the usefulness of AI in improving patients’ lives. Many definitions of fairness have been proposed, with discussions of various tensions that arise in the choice of an appropriate metric to use to evaluate bias; for example, should one aim for individual or group fairness? One central observation is that AI models apply “shortcut learning” whereby spurious features (such as chest tubes and portable radiographic markers on intensive care unit chest radiography) on medical images are used for prediction instead of identifying true pathology. Moreover, AI has been shown to have a remarkable ability to detect protected attributes of age, sex, and race, while the same models demonstrate bias against historically underserved subgroups of age, sex, and race in disease diagnosis. Therefore, an AI model may take shortcut predictions from these correlations and subsequently generate an outcome that is biased toward certain subgroups even when protected attributes are not explicitly used as inputs into the model. As a result, these subgroups became nonprivileged subgroups. In this review, the authors discuss the various types of bias from shortcut learning that may occur at different phases of AI model development, including data bias, modeling bias, and inference bias. The authors thereafter summarize various tool kits that can be used to evaluate and mitigate bias and note that these have largely been applied to nonmedical domains and require more evaluation for medical AI. The authors then summarize current techniques for mitigating bias from preprocessing (data-centric solutions) and during model development (computational solutions) and postprocessing (recalibration of learning). Ongoing legal changes where the use of a biased model will be penalized highlight the necessity of understanding, detecting, and mitigating biases from shortcut learning and will require diverse research teams looking at the whole AI pipeline.

AB - Despite the expert-level performance of artificial intelligence (AI) models for various medical imaging tasks, real-world performance failures with disparate outputs for various subgroups limit the usefulness of AI in improving patients’ lives. Many definitions of fairness have been proposed, with discussions of various tensions that arise in the choice of an appropriate metric to use to evaluate bias; for example, should one aim for individual or group fairness? One central observation is that AI models apply “shortcut learning” whereby spurious features (such as chest tubes and portable radiographic markers on intensive care unit chest radiography) on medical images are used for prediction instead of identifying true pathology. Moreover, AI has been shown to have a remarkable ability to detect protected attributes of age, sex, and race, while the same models demonstrate bias against historically underserved subgroups of age, sex, and race in disease diagnosis. Therefore, an AI model may take shortcut predictions from these correlations and subsequently generate an outcome that is biased toward certain subgroups even when protected attributes are not explicitly used as inputs into the model. As a result, these subgroups became nonprivileged subgroups. In this review, the authors discuss the various types of bias from shortcut learning that may occur at different phases of AI model development, including data bias, modeling bias, and inference bias. The authors thereafter summarize various tool kits that can be used to evaluate and mitigate bias and note that these have largely been applied to nonmedical domains and require more evaluation for medical AI. The authors then summarize current techniques for mitigating bias from preprocessing (data-centric solutions) and during model development (computational solutions) and postprocessing (recalibration of learning). Ongoing legal changes where the use of a biased model will be penalized highlight the necessity of understanding, detecting, and mitigating biases from shortcut learning and will require diverse research teams looking at the whole AI pipeline.

KW - Artificial Intelligence

KW - machine learning

UR - http://www.scopus.com/inward/record.url?scp=85173754973&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85173754973&partnerID=8YFLogxK

U2 - 10.1016/j.jacr.2023.06.025

DO - 10.1016/j.jacr.2023.06.025

M3 - Review article

C2 - 37506964

AN - SCOPUS:85173754973

SN - 1546-1440

VL - 20

SP - 842

EP - 851

JO - Journal of the American College of Radiology

JF - Journal of the American College of Radiology

IS - 9

ER -

“Shortcuts” Causing Bias in Radiology Artificial Intelligence: Causes, Evaluation, and Mitigation

Abstract

Keywords

ASJC Scopus subject areas

Access to Document

Other files and links

Fingerprint

Cite this