Introduction: Accurate subtyping of NSCLC into lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) is the cornerstone of NSCLC diagnosis. Cytology samples show higher rates of classification failure, that is, subtyping as non–small cell carcinoma, not otherwise specified (NSCC-NOS), than histology specimens. This study aims to identify algorithms, based on known cytomorphologic features, that aid accurate and successful subtyping of NSCLC on cytology.

Methods: A total of 13 expert cytopathologists participated anonymously in an online survey to subtype 119 NSCLC cytology cases (gold standard diagnoses: LUAD in 80 and LUSC in 39), a set enriched for nonkeratinizing LUSC. They selected, from 23 predefined cytomorphologic features, those that they used in subtyping. Data were analyzed using machine learning algorithms based on the random forest method and regression trees.

Results: Of the 1474 responses recorded, concordant cytology typing was achieved in 53.7% (792 of 1474). NSCC-NOS rates on cytology were similar among gold standard LUAD (36%) and LUSC (38%) cases. Misclassification rates were higher in gold standard LUSC (17.6%) than in gold standard LUAD (5.5%; p < 0.0001). Keratinization, when present, identified LUSC with high accuracy. In its absence, the machine learning algorithms developed on the basis of the experts' choices were unable to reduce cytology NSCC-NOS rates without increasing misclassification rates.

Conclusions: Suboptimal recognition of LUSC in the absence of keratinization remains the major hurdle in improving cytology subtyping accuracy: such cases either fail classification (NSCC-NOS) or are misclassified as LUAD. NSCC-NOS seems to be an inevitable morphologic diagnosis, which emphasizes that ancillary immunochemistry is necessary to achieve accurate subtyping on cytology.
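
The response counts reported in the Results can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the abstract; the variable names are illustrative, and the "possible responses" figure assumes that every pathologist was offered every case, which the abstract does not state explicitly.

```python
# Figures taken directly from the abstract.
N_PATHOLOGISTS = 13
N_CASES = 119
N_LUAD, N_LUSC = 80, 39      # gold standard diagnoses
N_RESPONSES = 1474           # responses actually recorded
N_CONCORDANT = 792           # responses with concordant cytology typing

# The gold standard groups should account for every case.
assert N_LUAD + N_LUSC == N_CASES

# Assumption (not stated in the abstract): each pathologist could
# respond to each case, so this is the upper bound on responses.
max_responses = N_PATHOLOGISTS * N_CASES
not_recorded = max_responses - N_RESPONSES

# Concordance rate as a percentage of recorded responses.
concordance_pct = 100 * N_CONCORDANT / N_RESPONSES

print(f"possible responses: {max_responses}")
print(f"not recorded:       {not_recorded}")
print(f"concordant:         {concordance_pct:.1f}%")
```

Under that assumption, the recorded total falls slightly short of the maximum, and the concordance percentage computed from the raw counts matches the 53.7% reported.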
Keywords
- Machine learning
- Non–small cell lung carcinoma
- Regression tree
ASJC Scopus subject areas
- Pulmonary and Respiratory Medicine