TY - JOUR
T1 - Quantitative CT and machine learning classification of fibrotic interstitial lung diseases
AU - Koo, Chi Wan
AU - Williams, James M.
AU - Liu, Grace
AU - Panda, Ananya
AU - Patel, Parth P.
AU - Frota Lima, Livia Maria M.
AU - Karwoski, Ronald A.
AU - Moua, Teng
AU - Larson, Nicholas B.
AU - Bratt, Alex
N1 - Funding Information:
We would like to acknowledge Dr. Sonia Watson for editorial assistance.
Publisher Copyright:
© 2022, The Author(s), under exclusive licence to European Society of Radiology.
PY - 2022/12
Y1 - 2022/12
N2 - Objectives: To evaluate quantitative computed tomography (QCT) features and QCT feature-based machine learning (ML) models in classifying interstitial lung diseases (ILDs). To compare QCT-ML and deep learning (DL) models’ performance. Methods: We retrospectively identified 1085 patients with pathologically proven usual interstitial pneumonitis (UIP), nonspecific interstitial pneumonitis (NSIP), and chronic hypersensitivity pneumonitis (CHP) who underwent peri-biopsy chest CT. The Kruskal-Wallis test evaluated QCT feature associations with each ILD. QCT features, patient demographics, and pulmonary function test (PFT) results were used to train eXtreme Gradient Boosting (training/validation set n = 911), yielding 3 models: M1 = QCT features only; M2 = M1 plus age and sex; M3 = M2 plus PFT results. A DL model was also developed. ML and DL model areas under the receiver operating characteristic curve (AUC) and 95% confidence intervals (CIs) were compared for multiclass (UIP vs. NSIP vs. CHP) and binary (UIP vs. non-UIP) classification performance. Results: The majority (69/78 [88%]) of QCT features successfully differentiated the 3 ILDs (adjusted p ≤ 0.05). All QCT-ML models achieved higher AUC than the DL model (multiclass AUC micro-averages 0.910, 0.910, 0.925, and 0.798 and macro-averages 0.895, 0.893, 0.925, and 0.779 for M1, M2, M3, and DL, respectively; binary AUC 0.880, 0.899, 0.898, and 0.869 for M1, M2, M3, and DL, respectively). M3 demonstrated statistically significantly better performance than M2 (∆AUC: 0.015, CI: [0.002, 0.029]) for multiclass prediction. Conclusions: QCT features successfully differentiated pathologically proven UIP, NSIP, and CHP. While QCT-based ML models outperformed a DL model for classifying ILDs, further investigations are warranted to determine if QCT-ML, DL, or a combination will be superior in ILD classification. Key Points: • Quantitative CT features successfully differentiated pathologically proven UIP, NSIP, and CHP. • Our quantitative CT-based machine learning models demonstrated high performance in classifying UIP, NSIP, and CHP histopathology, outperforming a deep learning model. • While our quantitative CT-based machine learning models performed better than a DL model, additional investigations are needed to determine whether either or a combination of both approaches delivers superior diagnostic performance.
KW - Chronic hypersensitivity pneumonitis
KW - Interstitial lung disease
KW - Machine learning
KW - Nonspecific interstitial pneumonitis
KW - Usual interstitial pneumonitis
UR - http://www.scopus.com/inward/record.url?scp=85131583232&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85131583232&partnerID=8YFLogxK
U2 - 10.1007/s00330-022-08875-4
DO - 10.1007/s00330-022-08875-4
M3 - Article
C2 - 35678861
AN - SCOPUS:85131583232
SN - 0938-7994
VL - 32
SP - 8152
EP - 8161
JO - European Radiology
JF - European Radiology
IS - 12
ER -