TY - JOUR
T1 - Decoding radiology reports
T2 - Potential application of OpenAI ChatGPT to enhance patient understanding of diagnostic reports
AU - Li, Hanzhou
AU - Moon, John T.
AU - Iyer, Deepak
AU - Balthazar, Patricia
AU - Krupinski, Elizabeth A.
AU - Bercu, Zachary L.
AU - Newsome, Janice M.
AU - Banerjee, Imon
AU - Gichoya, Judy W.
AU - Trivedi, Hari M.
N1 - Publisher Copyright:
© 2023 Elsevier Inc.
PY - 2023/9
Y1 - 2023/9
N2 - Purpose: To evaluate the complexity of diagnostic radiology reports across major imaging modalities and the ability of ChatGPT (Early March 2023 Version, OpenAI, California, USA) to simplify these reports to the 8th grade reading level of the average U.S. adult. Methods: We randomly sampled 100 radiographs (XR), 100 ultrasound (US), 100 CT, and 100 MRI radiology reports from our institution's database dated between 2022 and 2023 (N = 400). These were processed by ChatGPT using the prompt “Explain this radiology report to a patient in layman's terms in second person: <Report Text>”. Mean report length, Flesch reading ease score (FRES), and Flesch-Kincaid reading level (FKRL) were calculated for each report and ChatGPT output. T-tests were used to determine significance. Results: Mean report length was 164 ± 117 words, FRES was 38.0 ± 11.8, and FKRL was 10.4 ± 1.9. FKRL was significantly higher for CT and MRI than for US and XR. Only 60/400 (15%) had a FKRL <8.5. The mean simplified ChatGPT output length was 103 ± 36 words, FRES was 83.5 ± 5.6, and FKRL was 5.8 ± 1.1. This reflects a mean decrease of 61 words (p < 0.01), increase in FRES of 45.5 (p < 0.01), and decrease in FKRL of 4.6 (p < 0.01). All simplified outputs had FKRL <8.5. Discussion: Our study demonstrates the effective use of ChatGPT when tasked with simplifying radiology reports to below the 8th grade reading level. We report significant improvements in FRES, FKRL, and word count, the last of which requires modality-specific context.
KW - 21st century cures act
KW - Large language model
KW - Natural language processing
KW - Patient-centered reports
UR - http://www.scopus.com/inward/record.url?scp=85162156868&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85162156868&partnerID=8YFLogxK
U2 - 10.1016/j.clinimag.2023.06.008
DO - 10.1016/j.clinimag.2023.06.008
M3 - Article
C2 - 37336169
AN - SCOPUS:85162156868
SN - 0899-7071
VL - 101
SP - 137
EP - 141
JO - Clinical Imaging
JF - Clinical Imaging
ER -