Deep-learning approach to identifying cancer subtypes using high-dimensional genomic data

Runpu Chen, Le Yang, Steve Goodison, Yijun Sun

Research output: Contribution to journal › Article › peer-review

13 Scopus citations


Motivation: Cancer subtype classification has the potential to significantly improve disease prognosis and to support the development of individualized patient management. Existing methods are limited in their ability to handle extremely high-dimensional data and are affected by misleading, irrelevant factors, resulting in ambiguous and overlapping subtypes.

Results: To address these issues, we proposed a novel approach to disentangling and eliminating irrelevant factors by leveraging the power of deep learning. Specifically, we designed a deep-learning framework, referred to as DeepType, that performs joint supervised classification, unsupervised clustering and dimensionality reduction to learn a cancer-relevant data representation with cluster structure. We applied DeepType to the METABRIC breast cancer dataset and compared its performance to state-of-the-art methods. DeepType significantly outperformed the existing methods, identifying more robust subtypes while using fewer genes. The new approach provides a framework for deriving more accurate and robust molecular cancer subtypes from increasingly complex, multi-source data.
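The joint objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows one plausible way to combine a supervised classification term with an unsupervised clustering term computed on a low-dimensional hidden representation. The function names, the k-means-style penalty, and the trade-off weight `beta` are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def joint_loss(hidden, logits, labels, centers, assign, beta=0.1):
    """Hypothetical joint objective: cross-entropy plus a k-means-style penalty.

    hidden  : low-dimensional representation of each sample (list of lists)
    logits  : class scores of each sample (list of lists)
    labels  : integer class label of each sample
    centers : cluster centres in the hidden space (list of lists)
    assign  : index of the cluster each sample is assigned to
    beta    : assumed trade-off weight between the two terms
    """
    n = len(labels)
    # Supervised term: mean cross-entropy of the true class.
    ce = -sum(math.log(softmax(z)[y]) for z, y in zip(logits, labels)) / n
    # Unsupervised term: mean squared distance to the assigned cluster centre.
    km = sum(
        sum((h_j - c_j) ** 2 for h_j, c_j in zip(h, centers[k]))
        for h, k in zip(hidden, assign)
    ) / n
    return ce + beta * km, ce, km


# Toy usage: one sample, two classes, one cluster centre at the origin.
total, ce, km = joint_loss(
    hidden=[[1.0, 0.0]],
    logits=[[0.0, 0.0]],
    labels=[0],
    centers=[[0.0, 0.0]],
    assign=[0],
)
```

In a real network, both terms would be minimized together by gradient descent over the encoder weights, so that the hidden layer both separates the known classes and forms compact clusters. The sketch only evaluates the loss for fixed inputs.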

Original language: English (US)
Pages (from-to): 1476-1483
Number of pages: 8
Issue number: 5
State: Published - Mar 1 2020

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics

