Big Data Approaches to Phenotyping Acute Ischemic Stroke Using Automated Lesion Segmentation of Multi-Center Magnetic Resonance Imaging Data

Ona Wu, Stefan Winzeck, Anne Katrin Giese, Brandon L. Hancock, Mark R. Etherton, Mark J.R.J. Bouts, Kathleen Donahue, Markus D. Schirmer, Robert E. Irie, Steven J.T. Mocking, Elissa C. McIntosh, Raquel Bezerra, Konstantinos Kamnitsas, Petrea Frid, Johan Wasselius, John W. Cole, Huichun Xu, Lukas Holmegaard, Jordi Jiménez-Conde, Robin Lemmens, Eric Lorentzen, Patrick F. McArdle, James F. Meschia, Jaume Roquer, Tatjana Rundek, Ralph L. Sacco, Reinhold Schmidt, Pankaj Sharma, Agnieszka Slowik, Tara M. Stanne, Vincent Thijs, Achala Vagal, Daniel Woo, Stephen Bevan, Steven J. Kittner, Braxton D. Mitchell, Jonathan Rosand, Bradford B. Worrall, Christina Jern, Arne G. Lindgren, Jane Maguire, Natalia S. Rost

Research output: Contribution to journal › Article › peer-review

10 Scopus citations


Background and Purpose: We evaluated deep learning algorithms' segmentation of acute ischemic lesions on heterogeneous multi-center clinical diffusion-weighted magnetic resonance imaging (MRI) data sets and explored the potential role of this tool for phenotyping acute ischemic stroke.

Methods: Ischemic stroke data sets from the MRI-GENIE (MRI-Genetics Interface Exploration) repository consisting of 12 international genetic research centers were retrospectively analyzed using an automated deep learning segmentation algorithm consisting of an ensemble of 3-dimensional convolutional neural networks. Three ensembles were trained using data from the following: (1) 267 patients from an independent single-center cohort, (2) 267 patients from MRI-GENIE, and (3) mixture of (1) and (2). The algorithms' performances were compared against manual outlines from a separate 383 patient subset from MRI-GENIE. Univariable and multivariable logistic regression with respect to demographics, stroke subtypes, and vascular risk factors were performed to identify phenotypes associated with large acute diffusion-weighted MRI volumes and greater stroke severity in 2770 MRI-GENIE patients. Stroke topography was investigated.

Results: The ensemble consisting of a mixture of MRI-GENIE and single-center convolutional neural networks performed best. Subset analysis comparing automated and manual lesion volumes in 383 patients found excellent correlation (ρ=0.92; P<0.0001). Median (interquartile range) diffusion-weighted MRI lesion volumes from 2770 patients were 3.7 cm³ (0.9-16.6 cm³). Patients with small artery occlusion stroke subtype had smaller lesion volumes (P<0.0001) and different topography compared with other stroke subtypes.

Conclusions: Automated accurate clinical diffusion-weighted MRI lesion segmentation using deep learning algorithms trained with multi-center and diverse data is feasible. Both lesion volume and topography can provide insight into stroke subtypes with sufficient sample size from big heterogeneous multi-center clinical imaging phenotype data sets.
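The volume comparison reported in the abstract (a rank correlation between automated and manual lesion volumes, plus a median and interquartile range) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the voxel dimensions, and the toy volumes below are all hypothetical and chosen only for illustration, and the rank correlation is a plain Spearman implementation using the Python standard library.

```python
from statistics import median, quantiles


def lesion_volume_cm3(voxel_count, voxel_dims_mm):
    """Volume of a binary lesion mask: voxel count times voxel volume (mm^3 to cm^3)."""
    dx, dy, dz = voxel_dims_mm
    return voxel_count * dx * dy * dz / 1000.0


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy volumes in cm^3 (NOT data from the study).
manual = [0.5, 1.2, 3.9, 8.0, 16.4, 42.0]
automated = [1.0, 0.7, 3.5, 9.1, 15.0, 40.2]

rho = spearman_rho(automated, manual)
q1, _, q3 = quantiles(automated, n=4)  # default "exclusive" quartile method
print(f"rho={rho:.3f}, median={median(automated):.1f} cm^3, IQR=({q1:.1f}, {q3:.1f})")
```

A rank-based correlation is a natural choice for lesion volumes because they are heavily right-skewed, as the reported median and interquartile range suggest; the quartile convention used by `statistics.quantiles` may differ from the one used in the paper.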

Original language: English (US)
Pages (from-to): 1734-1741
Number of pages: 8
Issue number: 7
State: Published - Jul 1 2019


Keywords

  • diffusion magnetic resonance imaging
  • machine learning
  • phenotype
  • risk factors
  • stroke

ASJC Scopus subject areas

  • Clinical Neurology
  • Cardiology and Cardiovascular Medicine
  • Advanced and Specialized Nursing

