Efficient adversarial debiasing with concept activation vector — Medical image case-studies

Ramon Correa; Khushbu Pahwa; Bhavik Patel; Celine M. Vachon; Judy W. Gichoya; Imon Banerjee

doi:10.1016/j.jbi.2023.104548

Research output: Contribution to journal › Article › peer-review

Abstract

Background: A major hurdle for real-time deployment of AI models is ensuring that they are trustworthy for unseen populations. More often than not, these complex models are black boxes that generate promising results. However, when scrutinized, they begin to reveal implicit biases in their decision making, particularly for minority subgroups. Method: We develop an efficient adversarial debiasing approach with partial learning by incorporating the existing concept activation vector (CAV) methodology to reduce racial disparities while preserving performance on the targeted task. CAV was originally a model interpretability technique; we adopt it to identify the convolutional layers responsible for learning race and fine-tune only up to that layer instead of fine-tuning the complete network, which limits the drop in performance. Results: The methodology was evaluated on two independent medical image case studies, chest X-rays and mammograms, and we also performed external validation on a different racial population. On the external datasets for the chest X-ray use case, debiased models (averaged AUC 0.87) outperformed the baseline convolutional models (averaged AUC 0.57) as well as the models trained with the popular fine-tuning strategy (averaged AUC 0.81). Moreover, the mammogram models were debiased using a single dataset (White, Black, and Asian patients) and improved performance on an external dataset (averaged AUC from 0.8 to 0.86) with a completely different population (primarily Hispanic patients). Conclusion: In this study, we demonstrated that adversarial models trained only with internal data performed as well as, and often better than, the standard fine-tuning strategy with data from an external setting. The adversarial training approach described can be applied regardless of the predictor's model architecture, as long as the convolutional model is trained using a gradient-based method.
We release the training code under an academic open-source license: https://github.com/ramon349/JBI2023_TCAV_debiasing.
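
The layer-selection idea in the Method can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked in the abstract for that). It assumes that per-layer activations are already available as plain lists, swaps the trained linear classifier usually used to obtain a CAV for a simple difference of class means, and assumes that the layer to cut at is the one where the concept is most separable. All function names and the toy activations are hypothetical.

```python
from statistics import mean


def concept_direction(concept_acts, random_acts):
    """Difference of class means: a simple stand-in for a trained linear CAV."""
    dim = len(concept_acts[0])
    mu_c = [mean(a[i] for a in concept_acts) for i in range(dim)]
    mu_r = [mean(a[i] for a in random_acts) for i in range(dim)]
    direction = [c - r for c, r in zip(mu_c, mu_r)]
    midpoint = [(c + r) / 2 for c, r in zip(mu_c, mu_r)]
    return direction, midpoint


def separability(concept_acts, random_acts):
    """Fraction of activations on the correct side of the midpoint hyperplane."""
    direction, midpoint = concept_direction(concept_acts, random_acts)

    def score(act):
        return sum(d * (x - m) for d, x, m in zip(direction, act, midpoint))

    correct = sum(score(a) > 0 for a in concept_acts)
    correct += sum(score(a) <= 0 for a in random_acts)
    return correct / (len(concept_acts) + len(random_acts))


def split_layers(acts_by_layer):
    """Assumed rule: fine-tune up to the most concept-separable layer, freeze the rest."""
    names = list(acts_by_layer)
    scores = [separability(*acts_by_layer[name]) for name in names]
    cut = scores.index(max(scores))
    return names[: cut + 1], names[cut + 1:]


# Hypothetical activations per layer: (concept examples, random examples).
toy_acts = {
    "conv1": ([[0.0, 1.0], [1.0, 0.0]], [[0.0, 1.0], [1.0, 0.0]]),
    "conv2": ([[2.0, 0.0], [3.0, 1.0]], [[-2.0, 0.0], [-3.0, 1.0]]),
    "conv3": ([[2.0, 0.0], [-1.0, 0.0]], [[-2.0, 0.0], [1.0, 0.0]]),
}
tuned, frozen = split_layers(toy_acts)
print("fine-tune:", tuned, "freeze:", frozen)
```

In a real network, the layers in the first list would keep their gradients enabled during adversarial training while the remaining layers stay frozen; the selection rule itself would need to be checked against the paper.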

Original language	English (US)
Article number	104548
Journal	Journal of Biomedical Informatics
Volume	149
DOIs	https://doi.org/10.1016/j.jbi.2023.104548
State	Published - Jan 2024

Keywords

Adversarial fairness
Concept activation vector
Debiasing
Mammogram images
X-ray images

ASJC Scopus subject areas

Health Informatics
Computer Science Applications

Access to Document

10.1016/j.jbi.2023.104548

Cite this

@article{54ff48716fdf4f088f0535580537af48,

title = "Efficient adversarial debiasing with concept activation vector — Medical image case-studies",

abstract = "Background: A major hurdle for real-time deployment of AI models is ensuring that they are trustworthy for unseen populations. More often than not, these complex models are black boxes that generate promising results. However, when scrutinized, they begin to reveal implicit biases in their decision making, particularly for minority subgroups. Method: We develop an efficient adversarial debiasing approach with partial learning by incorporating the existing concept activation vector (CAV) methodology to reduce racial disparities while preserving performance on the targeted task. CAV was originally a model interpretability technique; we adopt it to identify the convolutional layers responsible for learning race and fine-tune only up to that layer instead of fine-tuning the complete network, which limits the drop in performance. Results: The methodology was evaluated on two independent medical image case studies, chest X-rays and mammograms, and we also performed external validation on a different racial population. On the external datasets for the chest X-ray use case, debiased models (averaged AUC 0.87) outperformed the baseline convolutional models (averaged AUC 0.57) as well as the models trained with the popular fine-tuning strategy (averaged AUC 0.81). Moreover, the mammogram models were debiased using a single dataset (White, Black, and Asian patients) and improved performance on an external dataset (averaged AUC from 0.8 to 0.86) with a completely different population (primarily Hispanic patients). Conclusion: In this study, we demonstrated that adversarial models trained only with internal data performed as well as, and often better than, the standard fine-tuning strategy with data from an external setting. The adversarial training approach described can be applied regardless of the predictor's model architecture, as long as the convolutional model is trained using a gradient-based method.
We release the training code under an academic open-source license: https://github.com/ramon349/JBI2023_TCAV_debiasing.",

keywords = "Adversarial fairness, Concept activation vector, Debiasing, Mammogram images, X-ray images",

author = "Ramon Correa and Khushbu Pahwa and Bhavik Patel and Vachon, {Celine M.} and Gichoya, {Judy W.} and Imon Banerjee",

note = "Publisher Copyright: {\textcopyright} 2023 Elsevier Inc.",

year = "2024",

month = jan,

doi = "10.1016/j.jbi.2023.104548",

language = "English (US)",

volume = "149",

journal = "Journal of Biomedical Informatics",

issn = "1532-0464",

publisher = "Academic Press Inc.",

}

TY - JOUR

T1 - Efficient adversarial debiasing with concept activation vector — Medical image case-studies

AU - Correa, Ramon

AU - Pahwa, Khushbu

AU - Patel, Bhavik

AU - Vachon, Celine M.

AU - Gichoya, Judy W.

AU - Banerjee, Imon

PY - 2024/1

Y1 - 2024/1

N2 - Background: A major hurdle for real-time deployment of AI models is ensuring that they are trustworthy for unseen populations. More often than not, these complex models are black boxes that generate promising results. However, when scrutinized, they begin to reveal implicit biases in their decision making, particularly for minority subgroups. Method: We develop an efficient adversarial debiasing approach with partial learning by incorporating the existing concept activation vector (CAV) methodology to reduce racial disparities while preserving performance on the targeted task. CAV was originally a model interpretability technique; we adopt it to identify the convolutional layers responsible for learning race and fine-tune only up to that layer instead of fine-tuning the complete network, which limits the drop in performance. Results: The methodology was evaluated on two independent medical image case studies, chest X-rays and mammograms, and we also performed external validation on a different racial population. On the external datasets for the chest X-ray use case, debiased models (averaged AUC 0.87) outperformed the baseline convolutional models (averaged AUC 0.57) as well as the models trained with the popular fine-tuning strategy (averaged AUC 0.81). Moreover, the mammogram models were debiased using a single dataset (White, Black, and Asian patients) and improved performance on an external dataset (averaged AUC from 0.8 to 0.86) with a completely different population (primarily Hispanic patients). Conclusion: In this study, we demonstrated that adversarial models trained only with internal data performed as well as, and often better than, the standard fine-tuning strategy with data from an external setting. The adversarial training approach described can be applied regardless of the predictor's model architecture, as long as the convolutional model is trained using a gradient-based method.
We release the training code under an academic open-source license: https://github.com/ramon349/JBI2023_TCAV_debiasing.

AB - Background: A major hurdle for real-time deployment of AI models is ensuring that they are trustworthy for unseen populations. More often than not, these complex models are black boxes that generate promising results. However, when scrutinized, they begin to reveal implicit biases in their decision making, particularly for minority subgroups. Method: We develop an efficient adversarial debiasing approach with partial learning by incorporating the existing concept activation vector (CAV) methodology to reduce racial disparities while preserving performance on the targeted task. CAV was originally a model interpretability technique; we adopt it to identify the convolutional layers responsible for learning race and fine-tune only up to that layer instead of fine-tuning the complete network, which limits the drop in performance. Results: The methodology was evaluated on two independent medical image case studies, chest X-rays and mammograms, and we also performed external validation on a different racial population. On the external datasets for the chest X-ray use case, debiased models (averaged AUC 0.87) outperformed the baseline convolutional models (averaged AUC 0.57) as well as the models trained with the popular fine-tuning strategy (averaged AUC 0.81). Moreover, the mammogram models were debiased using a single dataset (White, Black, and Asian patients) and improved performance on an external dataset (averaged AUC from 0.8 to 0.86) with a completely different population (primarily Hispanic patients). Conclusion: In this study, we demonstrated that adversarial models trained only with internal data performed as well as, and often better than, the standard fine-tuning strategy with data from an external setting. The adversarial training approach described can be applied regardless of the predictor's model architecture, as long as the convolutional model is trained using a gradient-based method.
We release the training code under an academic open-source license: https://github.com/ramon349/JBI2023_TCAV_debiasing.

KW - Adversarial fairness

KW - Concept activation vector

KW - Debiasing

KW - Mammogram images

KW - X-ray images

UR - http://www.scopus.com/inward/record.url?scp=85179393165&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85179393165&partnerID=8YFLogxK

U2 - 10.1016/j.jbi.2023.104548

DO - 10.1016/j.jbi.2023.104548

M3 - Article

C2 - 38043883

AN - SCOPUS:85179393165

SN - 1532-0464

VL - 149

JO - Journal of Biomedical Informatics

JF - Journal of Biomedical Informatics

M1 - 104548

ER -
