Reproducibility of Deep Learning Algorithms Developed for Medical Imaging Analysis: A Systematic Review

Mana Moassefi; Pouria Rouzrokh; Gian Marco Conte; Sanaz Vahdati; Tianyuan Fu; Aylin Tahmasebi; Mira Younis; Keyvan Farahani; Amilcare Gentili; Timothy Kline; Felipe C. Kitamura; Yuankai Huo; Shiba Kuanar; Khaled Younis; Bradley J. Erickson; Shahriar Faghani

doi:10.1007/s10278-023-00870-5

Radiology

Research output: Contribution to journal › Review article › peer-review

Abstract

Since 2000, there have been more than 8000 publications on radiology artificial intelligence (AI). AI breakthroughs allow complex tasks to be automated and even performed beyond human capabilities. However, a lack of detail about methods and algorithm code undercuts the scientific value of this work. Many science subfields have recently faced a reproducibility crisis, eroding trust in processes and results and contributing to the rise in retractions of scientific papers. For the same reasons, research in deep learning (DL) must also be reproducible. Although several valuable manuscript checklists for AI in medical imaging exist, they are not focused specifically on reproducibility. In this study, we conducted a systematic review of recently published papers in the field of DL to evaluate whether their methodological descriptions would allow their findings to be reproduced. We focused on the Journal of Digital Imaging (JDI), a specialized journal that publishes papers on AI and medical imaging. We used the keyword “Deep Learning” and collected the articles published between January 2020 and January 2022. We screened all the articles and included those that reported the development of a DL tool in medical imaging. From each included article, we extracted the reported details about the dataset, data handling steps, data splitting, model details, and performance metrics. We found 148 articles. Eighty were included after screening for articles that reported developing a DL model for medical image analysis. Five studies made their code publicly available, and 35 studies used publicly available datasets. We provide figures showing the ratio and absolute count of reported items across the included studies. According to our cross-sectional study, in JDI publications on DL in medical imaging, authors infrequently report the key elements needed to make their studies reproducible.
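The counts in the abstract can be turned into proportions with a few lines of arithmetic. The sketch below is illustrative only: the variable and function names are hypothetical, and it assumes that the code-sharing and public-dataset counts are both taken out of the 80 included studies, which the abstract implies but does not state outright.

```python
# Counts as stated in the abstract.
ARTICLES_FOUND = 148      # articles returned by the keyword search
ARTICLES_INCLUDED = 80    # articles retained after screening
CODE_PUBLIC = 5           # studies that made their code publicly available
PUBLIC_DATASETS = 35      # studies that used publicly available datasets


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimal places."""
    return round(100 * part / whole, 2)


# Assumption: the last two ratios use the 80 included studies as denominator.
summary = {
    "included_of_found": percent(ARTICLES_INCLUDED, ARTICLES_FOUND),
    "code_public_of_included": percent(CODE_PUBLIC, ARTICLES_INCLUDED),
    "public_data_of_included": percent(PUBLIC_DATASETS, ARTICLES_INCLUDED),
}

for name, value in summary.items():
    print(f"{name}: {value}%")
```

Under that assumption, roughly 54% of the retrieved articles were included, about 6% of the included studies shared code, and about 44% used public data.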

Original language	English (US)
Pages (from-to)	2306-2312
Number of pages	7
Journal	Journal of Digital Imaging
Volume	36
Issue number	5
DOIs	https://doi.org/10.1007/s10278-023-00870-5
State	Published - Oct 2023

Keywords

Artificial intelligence
Deep learning
Machine learning
Medical imaging
Reproducibility

ASJC Scopus subject areas

Radiological and Ultrasound Technology
Radiology, Nuclear Medicine and Imaging
Computer Science Applications

Access to Document

10.1007/s10278-023-00870-5

Cite this

Moassefi, M., Rouzrokh, P., Conte, G. M., Vahdati, S., Fu, T., Tahmasebi, A., Younis, M., Farahani, K., Gentili, A., Kline, T., Kitamura, F. C., Huo, Y., Kuanar, S., Younis, K., Erickson, B. J., & Faghani, S. (2023). Reproducibility of Deep Learning Algorithms Developed for Medical Imaging Analysis: A Systematic Review. Journal of Digital Imaging, 36(5), 2306-2312. https://doi.org/10.1007/s10278-023-00870-5

Moassefi, M, Rouzrokh, P, Conte, GM, Vahdati, S, Fu, T, Tahmasebi, A, Younis, M, Farahani, K, Gentili, A, Kline, T, Kitamura, FC, Huo, Y, Kuanar, S, Younis, K, Erickson, BJ & Faghani, S 2023, 'Reproducibility of Deep Learning Algorithms Developed for Medical Imaging Analysis: A Systematic Review', Journal of Digital Imaging, vol. 36, no. 5, pp. 2306-2312. https://doi.org/10.1007/s10278-023-00870-5

@article{47f7c1ea26fb4acabcc344264e31d68a,

title = "Reproducibility of Deep Learning Algorithms Developed for Medical Imaging Analysis: A Systematic Review",

abstract = "Since 2000, there have been more than 8000 publications on radiology artificial intelligence (AI). AI breakthroughs allow complex tasks to be automated and even performed beyond human capabilities. However, the lack of details on the methods and algorithm code undercuts its scientific value. Many science subfields have recently faced a reproducibility crisis, eroding trust in processes and results, and influencing the rise in retractions of scientific papers. For the same reasons, conducting research in deep learning (DL) also requires reproducibility. Although several valuable manuscript checklists for AI in medical imaging exist, they are not focused specifically on reproducibility. In this study, we conducted a systematic review of recently published papers in the field of DL to evaluate if the description of their methodology could allow the reproducibility of their findings. We focused on the Journal of Digital Imaging (JDI), a specialized journal that publishes papers on AI and medical imaging. We used the keyword “Deep Learning” and collected the articles published between January 2020 and January 2022. We screened all the articles and included the ones which reported the development of a DL tool in medical imaging. We extracted the reported details about the dataset, data handling steps, data splitting, model details, and performance metrics of each included article. We found 148 articles. Eighty were included after screening for articles that reported developing a DL model for medical image analysis. Five studies have made their code publicly available, and 35 studies have utilized publicly available datasets. We provided figures to show the ratio and absolute count of reported items from included studies. According to our cross-sectional study, in JDI publications on DL in medical imaging, authors infrequently report the key elements of their study to make it reproducible.",

keywords = "Artificial intelligence, Deep learning, Machine learning, Medical imaging, Reproducibility",

author = "Mana Moassefi and Pouria Rouzrokh and Conte, {Gian Marco} and Sanaz Vahdati and Tianyuan Fu and Aylin Tahmasebi and Mira Younis and Keyvan Farahani and Amilcare Gentili and Timothy Kline and Kitamura, {Felipe C.} and Yuankai Huo and Shiba Kuanar and Khaled Younis and Erickson, {Bradley J.} and Shahriar Faghani",

note = "Publisher Copyright: {\textcopyright} 2023, The Author(s) under exclusive licence to Society for Imaging Informatics in Medicine.",

year = "2023",

month = oct,

doi = "10.1007/s10278-023-00870-5",

language = "English (US)",

volume = "36",

pages = "2306--2312",

journal = "Journal of Digital Imaging",

issn = "0897-1889",

publisher = "Springer New York",

number = "5",

}

TY - JOUR

T1 - Reproducibility of Deep Learning Algorithms Developed for Medical Imaging Analysis

T2 - A Systematic Review

AU - Moassefi, Mana

AU - Rouzrokh, Pouria

AU - Conte, Gian Marco

AU - Vahdati, Sanaz

AU - Fu, Tianyuan

AU - Tahmasebi, Aylin

AU - Younis, Mira

AU - Farahani, Keyvan

AU - Gentili, Amilcare

AU - Kline, Timothy

AU - Kitamura, Felipe C.

AU - Huo, Yuankai

AU - Kuanar, Shiba

AU - Younis, Khaled

AU - Erickson, Bradley J.

AU - Faghani, Shahriar

PY - 2023/10

Y1 - 2023/10

N2 - Since 2000, there have been more than 8000 publications on radiology artificial intelligence (AI). AI breakthroughs allow complex tasks to be automated and even performed beyond human capabilities. However, the lack of details on the methods and algorithm code undercuts its scientific value. Many science subfields have recently faced a reproducibility crisis, eroding trust in processes and results, and influencing the rise in retractions of scientific papers. For the same reasons, conducting research in deep learning (DL) also requires reproducibility. Although several valuable manuscript checklists for AI in medical imaging exist, they are not focused specifically on reproducibility. In this study, we conducted a systematic review of recently published papers in the field of DL to evaluate if the description of their methodology could allow the reproducibility of their findings. We focused on the Journal of Digital Imaging (JDI), a specialized journal that publishes papers on AI and medical imaging. We used the keyword “Deep Learning” and collected the articles published between January 2020 and January 2022. We screened all the articles and included the ones which reported the development of a DL tool in medical imaging. We extracted the reported details about the dataset, data handling steps, data splitting, model details, and performance metrics of each included article. We found 148 articles. Eighty were included after screening for articles that reported developing a DL model for medical image analysis. Five studies have made their code publicly available, and 35 studies have utilized publicly available datasets. We provided figures to show the ratio and absolute count of reported items from included studies. According to our cross-sectional study, in JDI publications on DL in medical imaging, authors infrequently report the key elements of their study to make it reproducible.

KW - Artificial intelligence

KW - Deep learning

KW - Machine learning

KW - Medical imaging

KW - Reproducibility

UR - http://www.scopus.com/inward/record.url?scp=85164008547&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85164008547&partnerID=8YFLogxK

U2 - 10.1007/s10278-023-00870-5

DO - 10.1007/s10278-023-00870-5

M3 - Review article

C2 - 37407841

AN - SCOPUS:85164008547

SN - 0897-1889

VL - 36

SP - 2306

EP - 2312

JO - Journal of Digital Imaging

JF - Journal of Digital Imaging

IS - 5

ER -
