TY - JOUR
T1 - Knowledge-Infused Global-Local Data Fusion for Spatial Predictive Modeling in Precision Medicine
AU - Wang, Lujia
AU - Hawkins-Daarud, Andrea
AU - Swanson, Kristin R.
AU - Hu, Leland S.
AU - Li, Jing
N1 - Publisher Copyright: © 2004-2012 IEEE.
PY - 2022/7/1
Y1 - 2022/7/1
N2 - The automated capability of generating spatial prediction for a variable of interest is desirable in various science and engineering domains. Take precision medicine of cancer as an example, in which the goal is to match patients with treatments based on molecular markers identified in each patient's tumor. A substantial challenge, however, is that the molecular markers can vary significantly at different spatial locations of a tumor. If this spatial distribution could be predicted, the precision of cancer treatment could be greatly improved by adapting treatment to the spatial molecular heterogeneity. This is a challenging task because no technology is available to measure the molecular markers at each spatial location within a tumor. Biopsy samples provide direct measurement, but they are scarce/local. Imaging, such as MRI, is global, but it only provides proxy/indirect measurement. Also available are mechanistic models or domain knowledge, which are often approximate or incomplete. This article proposes a novel machine learning framework to fuse the three sources of data/information to generate a spatial prediction, namely, the knowledge-infused global-local (KGL) data fusion model. A novel mathematical formulation is proposed and solved with theoretical study. We present a real-data application of predicting the spatial distribution of tumor cell density (TCD) - an important molecular marker for brain cancer. A total of 82 biopsy samples were acquired from 18 patients with glioblastoma, together with six MRI contrast images from each patient and biological knowledge encoded by a PDE simulator-based mechanistic model called proliferation-invasion (PI). KGL achieved the highest prediction accuracy and minimum prediction uncertainty compared with a variety of competing methods. The result has important implications for providing individualized, spatially optimized treatment for each patient. 
Note to Practitioners - This article proposes a machine learning framework to fuse local data, global imaging, and domain knowledge to generate a spatial prediction for a variable of interest. This methodology is relevant to multiple application domains. In precision medicine, it will allow for mapping the spatial distribution of important, treatment-informing molecular markers across each tumor by integrating biopsy data, MRI, and biological knowledge. This capability can help resolve the spatial heterogeneity of molecular characteristics and greatly improve the precision of cancer treatment. Other applications include early detection of regional fire risk across a forest by integrating ground/aerial survey data, satellite imagery, and fire simulator output, as well as regional poverty estimation for resource allocation.
KW - Health care
KW - machine learning
KW - precision medicine
KW - statistical modeling
UR - http://www.scopus.com/inward/record.url?scp=85105861456&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85105861456&partnerID=8YFLogxK
U2 - 10.1109/TASE.2021.3076117
DO - 10.1109/TASE.2021.3076117
M3 - Article
AN - SCOPUS:85105861456
SN - 1545-5955
VL - 19
SP - 2203
EP - 2215
JO - IEEE Transactions on Automation Science and Engineering
JF - IEEE Transactions on Automation Science and Engineering
IS - 3
ER -