FedFSA: Hybrid and federated framework for functional status ascertainment across institutions

Sunyang Fu; Heling Jia; Maria Vassilaki; Vipina K. Keloth; Yifang Dang; Yujia Zhou; Muskan Garg; Ronald C. Petersen; Jennifer St Sauver; Sungrim Moon; Liwei Wang; Andrew Wen; Fang Li; Hua Xu; Cui Tao; Jungwei Fan; Hongfang Liu; Sunghwan Sohn

doi:10.1016/j.jbi.2024.104623

Research output: Contribution to journal › Article › peer-review

Abstract

Introduction: Functional status describes a patient's independence in performing activities of daily living (ADLs), including basic ADLs (bADL) and more complex instrumental ADLs (iADL). Existing studies have shown that patients’ functional status is a strong predictor of health outcomes, particularly in older adults. Despite its usefulness, much of the functional status information is stored in electronic health records (EHRs) in either semi-structured or free-text formats. This indicates a pressing need to leverage computational approaches such as natural language processing (NLP) to accelerate the curation of functional status information. In this study, we introduced FedFSA, a hybrid and federated NLP framework designed to extract functional status information from EHRs across multiple healthcare institutions.

Methods: FedFSA consists of four major components: 1) individual sites (clients) with their private local data, 2) a rule-based information extraction (IE) framework for ADL extraction, 3) a BERT model for functional status impairment classification, and 4) a concept normalizer. The framework was implemented using the OHNLP Backbone for rule-based IE and the open-source Flower and PyTorch libraries for the federated BERT components. For gold standard data generation, we carried out corpus annotation to identify functional status-related expressions based on International Classification of Functioning, Disability and Health (ICF) definitions. Four healthcare institutions were included in the study. To assess FedFSA, we evaluated the performance of category- and institution-specific ADL extraction across different experimental designs.

Results: ADL extraction performance ranged from an F1-score of 0.907 to 0.986 for bADL and 0.825 to 0.951 for iADL across the four healthcare sites. The performance for ADL extraction with impairment ranged from an F1-score of 0.722 to 0.954 for bADL and 0.674 to 0.813 for iADL across the four healthcare sites. For category-specific ADL extraction, laundry and transferring yielded relatively high performance, while dressing, medication, bathing, and continence achieved moderate-to-high performance. Conversely, food preparation and toileting showed low performance.

Conclusion: NLP performance varied across ADL categories and healthcare sites. Federated learning using the FedFSA framework outperformed non-federated learning for impaired ADL extraction at all healthcare sites. Our study demonstrated the potential of the federated learning framework for functional status extraction and impairment classification in EHRs, exemplifying the importance of a large-scale, multi-institutional collaborative development effort.
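
To illustrate how a federated model can be trained without sharing private local data, the sketch below shows weighted federated averaging (FedAvg), a common baseline aggregation strategy that Flower also provides. The abstract does not state which aggregation strategy FedFSA used, so this is an assumption for illustration only. The function name `fedavg`, the parameter names, and the four site updates are all hypothetical, and plain Python lists stand in for PyTorch tensors.

```python
from typing import Dict, List, Tuple

# Parameter name -> flat list of values (a stand-in for a PyTorch state dict).
Weights = Dict[str, List[float]]


def fedavg(client_updates: List[Tuple[Weights, int]]) -> Weights:
    """Average client parameters, weighting each client by its example count."""
    if not client_updates:
        raise ValueError("need at least one client update")
    total = sum(n for _, n in client_updates)
    if total <= 0:
        raise ValueError("total number of examples must be positive")

    first_weights, _ = client_updates[0]
    averaged: Weights = {}
    for name, values in first_weights.items():
        acc = [0.0] * len(values)
        for weights, n in client_updates:
            for i, v in enumerate(weights[name]):
                # Each site contributes in proportion to its local data size.
                acc[i] += v * n / total
        averaged[name] = acc
    return averaged


# Hypothetical updates from four sites: (parameters, number of local examples).
# Only these parameters leave each site; the clinical notes stay local.
site_updates = [
    ({"classifier.weight": [0.0, 1.0], "classifier.bias": [0.0]}, 100),
    ({"classifier.weight": [1.0, 1.0], "classifier.bias": [1.0]}, 100),
    ({"classifier.weight": [2.0, 1.0], "classifier.bias": [2.0]}, 100),
    ({"classifier.weight": [3.0, 1.0], "classifier.bias": [3.0]}, 100),
]
global_weights = fedavg(site_updates)
print(global_weights)
```

In a full training loop, the server would send `global_weights` back to each site for another round of local fine-tuning, repeating until the model converges. Weighting by example count means that sites with more annotated data influence the global model more, which is one design choice among several.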

Original language	English (US)
Article number	104623
Journal	Journal of Biomedical Informatics
Volume	152
DOIs	https://doi.org/10.1016/j.jbi.2024.104623
State	Published - Apr 2024

Keywords

Deep learning
Electronic health records
Federated learning
Functional status
Natural language processing

ASJC Scopus subject areas

Health Informatics
Computer Science Applications

Cite this

Fu, S., Jia, H., Vassilaki, M., Keloth, V. K., Dang, Y., Zhou, Y., Garg, M., Petersen, R. C., St Sauver, J., Moon, S., Wang, L., Wen, A., Li, F., Xu, H., Tao, C., Fan, J., Liu, H., & Sohn, S. (2024). FedFSA: Hybrid and federated framework for functional status ascertainment across institutions. Journal of Biomedical Informatics, 152, Article 104623. https://doi.org/10.1016/j.jbi.2024.104623

@article{87350d3b0ba04e47813de1668dbce181,

title = "FedFSA: Hybrid and federated framework for functional status ascertainment across institutions",

abstract = "Introduction: Functional status describes a patient's independence in performing activities of daily living (ADLs), including basic ADLs (bADL) and more complex instrumental ADLs (iADL). Existing studies have shown that patients{\textquoteright} functional status is a strong predictor of health outcomes, particularly in older adults. Despite its usefulness, much of the functional status information is stored in electronic health records (EHRs) in either semi-structured or free-text formats. This indicates a pressing need to leverage computational approaches such as natural language processing (NLP) to accelerate the curation of functional status information. In this study, we introduced FedFSA, a hybrid and federated NLP framework designed to extract functional status information from EHRs across multiple healthcare institutions. Methods: FedFSA consists of four major components: 1) individual sites (clients) with their private local data, 2) a rule-based information extraction (IE) framework for ADL extraction, 3) a BERT model for functional status impairment classification, and 4) a concept normalizer. The framework was implemented using the OHNLP Backbone for rule-based IE and the open-source Flower and PyTorch libraries for the federated BERT components. For gold standard data generation, we carried out corpus annotation to identify functional status-related expressions based on International Classification of Functioning, Disability and Health (ICF) definitions. Four healthcare institutions were included in the study. To assess FedFSA, we evaluated the performance of category- and institution-specific ADL extraction across different experimental designs. Results: ADL extraction performance ranged from an F1-score of 0.907 to 0.986 for bADL and 0.825 to 0.951 for iADL across the four healthcare sites. The performance for ADL extraction with impairment ranged from an F1-score of 0.722 to 0.954 for bADL and 0.674 to 0.813 for iADL across the four healthcare sites. For category-specific ADL extraction, laundry and transferring yielded relatively high performance, while dressing, medication, bathing, and continence achieved moderate-to-high performance. Conversely, food preparation and toileting showed low performance. Conclusion: NLP performance varied across ADL categories and healthcare sites. Federated learning using the FedFSA framework outperformed non-federated learning for impaired ADL extraction at all healthcare sites. Our study demonstrated the potential of the federated learning framework for functional status extraction and impairment classification in EHRs, exemplifying the importance of a large-scale, multi-institutional collaborative development effort.",

keywords = "Deep learning, Electronic health records, Federated learning, Functional status, Natural language processing",

author = "Sunyang Fu and Heling Jia and Maria Vassilaki and Keloth, {Vipina K.} and Yifang Dang and Yujia Zhou and Muskan Garg and Petersen, {Ronald C.} and {St Sauver}, Jennifer and Sungrim Moon and Liwei Wang and Andrew Wen and Fang Li and Hua Xu and Cui Tao and Jungwei Fan and Hongfang Liu and Sunghwan Sohn",

note = "Publisher Copyright: {\textcopyright} 2024 Elsevier Inc.",

year = "2024",

month = apr,

doi = "10.1016/j.jbi.2024.104623",

language = "English (US)",

volume = "152",

journal = "Journal of Biomedical Informatics",

issn = "1532-0464",

publisher = "Academic Press Inc.",

}

TY - JOUR

T1 - FedFSA

T2 - Hybrid and federated framework for functional status ascertainment across institutions

AU - Fu, Sunyang

AU - Jia, Heling

AU - Vassilaki, Maria

AU - Keloth, Vipina K.

AU - Dang, Yifang

AU - Zhou, Yujia

AU - Garg, Muskan

AU - Petersen, Ronald C.

AU - St Sauver, Jennifer

AU - Moon, Sungrim

AU - Wang, Liwei

AU - Wen, Andrew

AU - Li, Fang

AU - Xu, Hua

AU - Tao, Cui

AU - Fan, Jungwei

AU - Liu, Hongfang

AU - Sohn, Sunghwan

PY - 2024/4

Y1 - 2024/4

AB - Introduction: Functional status describes a patient's independence in performing activities of daily living (ADLs), including basic ADLs (bADL) and more complex instrumental ADLs (iADL). Existing studies have shown that patients’ functional status is a strong predictor of health outcomes, particularly in older adults. Despite its usefulness, much of the functional status information is stored in electronic health records (EHRs) in either semi-structured or free-text formats. This indicates a pressing need to leverage computational approaches such as natural language processing (NLP) to accelerate the curation of functional status information. In this study, we introduced FedFSA, a hybrid and federated NLP framework designed to extract functional status information from EHRs across multiple healthcare institutions. Methods: FedFSA consists of four major components: 1) individual sites (clients) with their private local data, 2) a rule-based information extraction (IE) framework for ADL extraction, 3) a BERT model for functional status impairment classification, and 4) a concept normalizer. The framework was implemented using the OHNLP Backbone for rule-based IE and the open-source Flower and PyTorch libraries for the federated BERT components. For gold standard data generation, we carried out corpus annotation to identify functional status-related expressions based on International Classification of Functioning, Disability and Health (ICF) definitions. Four healthcare institutions were included in the study. To assess FedFSA, we evaluated the performance of category- and institution-specific ADL extraction across different experimental designs. Results: ADL extraction performance ranged from an F1-score of 0.907 to 0.986 for bADL and 0.825 to 0.951 for iADL across the four healthcare sites. The performance for ADL extraction with impairment ranged from an F1-score of 0.722 to 0.954 for bADL and 0.674 to 0.813 for iADL across the four healthcare sites. For category-specific ADL extraction, laundry and transferring yielded relatively high performance, while dressing, medication, bathing, and continence achieved moderate-to-high performance. Conversely, food preparation and toileting showed low performance. Conclusion: NLP performance varied across ADL categories and healthcare sites. Federated learning using the FedFSA framework outperformed non-federated learning for impaired ADL extraction at all healthcare sites. Our study demonstrated the potential of the federated learning framework for functional status extraction and impairment classification in EHRs, exemplifying the importance of a large-scale, multi-institutional collaborative development effort.

KW - Deep learning

KW - Electronic health records

KW - Federated learning

KW - Functional status

KW - Natural language processing

UR - http://www.scopus.com/inward/record.url?scp=85187660555&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85187660555&partnerID=8YFLogxK

U2 - 10.1016/j.jbi.2024.104623

DO - 10.1016/j.jbi.2024.104623

M3 - Article

C2 - 38458578

AN - SCOPUS:85187660555

SN - 1532-0464

VL - 152

JO - Journal of Biomedical Informatics

JF - Journal of Biomedical Informatics

M1 - 104623

ER -
