TY - JOUR
T1 - Expert artificial intelligence-based natural language processing characterises childhood asthma
AU - Seol, Hee Yun
AU - Rolfes, Mary C.
AU - Chung, Wi
AU - Sohn, Sunghwan
AU - Ryu, Euijung
AU - Park, Miguel A.
AU - Kita, Hirohito
AU - Ono, Junya
AU - Croghan, Ivana
AU - Armasu, Sebastian M.
AU - Castro-Rodriguez, Jose A.
AU - Weston, Jill D.
AU - Liu, Hongfang
AU - Juhn, Young
N1 - Funding Information:
We would like to thank Mrs Kelly Okeson for her administrative assistance. We would also like to thank Drs Rohit D Divekar, Thanai Pongdee, Bong Seok Choi and Mrs Julie C Porcher for their review and helpful comments. Funding information: National Institutes of Health (NIH)-funded R01 grant (R01 HL126667), R21 grants (R21AI116839-01 and R21AI142702) and T. Denny Sanford Pediatric Collaborative Research Fund. This study used the resources of the Rochester Epidemiology Project (R01-AG34676) from the National Institute on Aging and CTSA Grant Number UL1 TR000135 from the National Center for Advancing Translational Sciences.
Publisher Copyright:
© Author(s).
PY - 2020/2/4
Y1 - 2020/2/4
N2 - Introduction: The lack of effective, consistent, reproducible and efficient asthma ascertainment methods results in inconsistent asthma cohorts and study results for clinical trials or other studies. We aimed to assess whether application of expert artificial intelligence (AI)-based natural language processing (NLP) algorithms for two existing asthma criteria to electronic health records of a paediatric population systematically identifies childhood asthma and its subgroups with distinctive characteristics. Methods: Using the 1997-2007 Olmsted County Birth Cohort, we applied validated NLP algorithms for Predetermined Asthma Criteria (NLP-PAC) as well as Asthma Predictive Index (NLP-API). We categorised subjects into four groups (both criteria positive (NLP-PAC+/NLP-API+); PAC positive only (NLP-PAC+ only); API positive only (NLP-API+ only); and both criteria negative (NLP-PAC-/NLP-API-)) and characterised them. Results were replicated in unsupervised cluster analysis for asthmatics and a random sample of 300 children using laboratory and pulmonary function tests (PFTs). Results: Of the 8196 subjects (51% male, 80% white), we identified 1614 (20%), NLP-PAC+/NLP-API+; 954 (12%), NLP-PAC+ only; 105 (1%), NLP-API+ only; and 5523 (67%), NLP-PAC-/NLP-API-. Asthmatic children classified as NLP-PAC+/NLP-API+ showed earlier onset asthma, more Th2-high profile, poorer lung function, higher asthma exacerbation and higher risk of asthma-associated comorbidities compared with other groups. These results were consistent with those based on unsupervised cluster analysis and lab and PFT data of a random sample of study subjects. Conclusion: Expert AI-based NLP algorithms for two asthma criteria systematically identify childhood asthma with distinctive characteristics. This approach may improve precision, reproducibility, consistency and efficiency of large-scale clinical studies for asthma and enable population management.
AB - Introduction: The lack of effective, consistent, reproducible and efficient asthma ascertainment methods results in inconsistent asthma cohorts and study results for clinical trials or other studies. We aimed to assess whether application of expert artificial intelligence (AI)-based natural language processing (NLP) algorithms for two existing asthma criteria to electronic health records of a paediatric population systematically identifies childhood asthma and its subgroups with distinctive characteristics. Methods: Using the 1997-2007 Olmsted County Birth Cohort, we applied validated NLP algorithms for Predetermined Asthma Criteria (NLP-PAC) as well as Asthma Predictive Index (NLP-API). We categorised subjects into four groups (both criteria positive (NLP-PAC+/NLP-API+); PAC positive only (NLP-PAC+ only); API positive only (NLP-API+ only); and both criteria negative (NLP-PAC-/NLP-API-)) and characterised them. Results were replicated in unsupervised cluster analysis for asthmatics and a random sample of 300 children using laboratory and pulmonary function tests (PFTs). Results: Of the 8196 subjects (51% male, 80% white), we identified 1614 (20%), NLP-PAC+/NLP-API+; 954 (12%), NLP-PAC+ only; 105 (1%), NLP-API+ only; and 5523 (67%), NLP-PAC-/NLP-API-. Asthmatic children classified as NLP-PAC+/NLP-API+ showed earlier onset asthma, more Th2-high profile, poorer lung function, higher asthma exacerbation and higher risk of asthma-associated comorbidities compared with other groups. These results were consistent with those based on unsupervised cluster analysis and lab and PFT data of a random sample of study subjects. Conclusion: Expert AI-based NLP algorithms for two asthma criteria systematically identify childhood asthma with distinctive characteristics. This approach may improve precision, reproducibility, consistency and efficiency of large-scale clinical studies for asthma and enable population management.
KW - asthma
KW - asthma epidemiology
KW - paediatric asthma
UR - http://www.scopus.com/inward/record.url?scp=85078661728&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85078661728&partnerID=8YFLogxK
U2 - 10.1136/bmjresp-2019-000524
DO - 10.1136/bmjresp-2019-000524
M3 - Article
C2 - 33371009
AN - SCOPUS:85078661728
SN - 2052-4439
VL - 7
JO - BMJ Open Respiratory Research
JF - BMJ Open Respiratory Research
IS - 1
M1 - e000524
ER -