Knowledge-Infused Global-Local Data Fusion for Spatial Predictive Modeling in Precision Medicine

Lujia Wang, Andrea Hawkins-Daarud, Kristin R. Swanson, Leland S. Hu, Jing Li

Research output: Contribution to journal › Article › peer-review


The ability to automatically generate a spatial prediction for a variable of interest is desirable in many science and engineering domains. Consider precision medicine for cancer, where the goal is to match patients with treatments based on molecular markers identified in each patient's tumor. A substantial challenge, however, is that the molecular markers can vary significantly across spatial locations within a tumor. If this spatial distribution could be predicted, the precision of cancer treatment could be greatly improved by adapting treatment to the spatial molecular heterogeneity. This is a challenging task because no technology is available to measure the molecular markers at every spatial location within a tumor. Biopsy samples provide direct measurements, but they are scarce and local. Imaging, such as MRI, is global, but it provides only proxy (indirect) measurements. Also available are mechanistic models or domain knowledge, which are often approximate or incomplete. This article proposes a novel machine learning framework, the knowledge-infused global-local (KGL) data fusion model, that fuses these three sources of information to generate a spatial prediction. A novel mathematical formulation is proposed and solved, supported by theoretical analysis. We present a real-data application: predicting the spatial distribution of tumor cell density (TCD), an important molecular marker for brain cancer. A total of 82 biopsy samples were acquired from 18 patients with glioblastoma, together with six MRI contrast images from each patient and biological knowledge encoded by a PDE simulator-based mechanistic model called proliferation-invasion (PI). KGL achieved the highest prediction accuracy and the lowest prediction uncertainty among a variety of competing methods. The result has important implications for providing individualized, spatially optimized treatment for each patient.
Note to Practitioners - This article proposes a machine learning framework to fuse local data, global imaging, and domain knowledge to generate a spatial prediction for a variable of interest. This methodology is relevant to multiple application domains. In precision medicine, it will allow for mapping the spatial distribution of important, treatment-informing molecular markers across each tumor by integrating biopsy data, MRI, and biological knowledge. This capability can help resolve the spatial heterogeneity of molecular characteristics and greatly improve the precision of cancer treatment. Other applications include early detection of regional fire risk across a forest by integrating ground/aerial survey data, satellite imagery, and fire simulator output, as well as regional poverty estimation for resource allocation.

Original language: English (US)
Pages (from-to): 2203-2215
Number of pages: 13
Journal: IEEE Transactions on Automation Science and Engineering
Issue number: 3
State: Published - Jul 1 2022


Keywords

  • Health care
  • machine learning
  • precision medicine
  • statistical modeling

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Electrical and Electronic Engineering

