Machine learning helps identify new drug mechanisms in triple-negative breast cancer

Arjun P. Athreya, Alan J. Gaglio, Junmei Cairns, Krishna R. Kalari, Richard M. Weinshilboum, Liewei Wang, Zbigniew T. Kalbarczyk, Ravishankar K. Iyer

Research output: Contribution to journal › Article › peer-review

3 Scopus citations


This paper demonstrates that machine learning approaches can select, from the 23,398 genes of the human genome, a small number of genes to test in the laboratory in order to establish new drug mechanisms. As a case study, this paper uses MDA-MB-231 breast cancer single cells treated with the antidiabetic drug metformin. We show that mixture-model-based unsupervised methods, validated by hierarchical clustering, can identify single-cell subpopulations (clusters). These clusters are characterized by a small set of genes (1% of the genome) that are significantly differentially expressed across the clusters and are also highly correlated with pathways with anticancer effects driven by metformin. Within the identified small set of genes associated with reduced breast cancer incidence, laboratory experiments on one gene, CDC42, showed that its downregulation by metformin inhibited cancer cell migration and proliferation, thus validating the ability of machine learning approaches to identify biologically relevant candidates for laboratory experiments. Given the large size of the human genome and limitations in cost and skilled resources, the broader impact of this work lies in narrowing the search to a small set of genes that are differentially expressed after drug treatment. This augments the drug-disease knowledge of pharmacogenomics experts in laboratory investigations and could help establish novel biological mechanisms associated with drug response in diseases beyond breast cancer.
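The clustering-plus-validation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a single hypothetical gene, a hand-made list of toy expression values, a two-component one-dimensional Gaussian mixture fitted by EM, and single-linkage hierarchical clustering cut at two clusters (which, in one dimension, amounts to splitting at the largest gap between sorted values). All function and variable names are illustrative.

```python
import math


def fit_two_component_gmm(values, n_iter=50, var_floor=1e-6):
    """Fit a 1-D two-component Gaussian mixture by EM; return (means, labels)."""
    n = len(values)
    overall_mean = sum(values) / n
    overall_var = sum((v - overall_mean) ** 2 for v in values) / n
    means = [min(values), max(values)]  # crude, deterministic initialisation
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        resp = []
        for v in values:
            dens = [
                weights[k]
                * math.exp(-((v - means[k]) ** 2) / (2 * variances[k]))
                / math.sqrt(2 * math.pi * variances[k])
                for k in range(2)
            ]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances from the posteriors.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var_k = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var_k, var_floor)
    labels = [0 if r[0] >= r[1] else 1 for r in resp]
    return means, labels


def single_linkage_two_clusters(values):
    """Single-linkage clustering cut at two clusters (1-D: split at largest gap)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    gaps = [values[order[j + 1]] - values[order[j]] for j in range(len(order) - 1)]
    cut = gaps.index(max(gaps))
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = 0 if rank <= cut else 1
    return labels


def agreement(labels_a, labels_b):
    """Fraction of items on which two 2-cluster labelings agree, up to label swap."""
    same = sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)
    return max(same, 1 - same)


# Toy data (assumption): 20 "low" and 20 "high" expression values, interleaved.
low = [1.0 + 0.1 * i for i in range(20)]
high = [6.0 + 0.1 * i for i in range(20)]
expression = [v for pair in zip(low, high) for v in pair]

gmm_means, gmm_labels = fit_two_component_gmm(expression)
hc_labels = single_linkage_two_clusters(expression)
print("mixture means:", gmm_means)
print("agreement with hierarchical clustering:", agreement(gmm_labels, hc_labels))
```

When two independent clustering methods assign cells to the same groups, the subpopulations are less likely to be an artifact of one method. Real single-cell data would be high-dimensional and would typically call for a multivariate mixture model and a model-selection step for the number of clusters, which this sketch omits.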

Original language: English (US)
Article number: 8401331
Pages (from-to): 251-259
Number of pages: 9
Journal: IEEE Transactions on Nanobioscience
Issue number: 3
State: Published - Jul 2018


Keywords

  • Breast Cancer
  • Metformin
  • Mixture-Models
  • Model-Based Learning
  • Single-Cell RNASeq
  • Unsupervised Learning

ASJC Scopus subject areas

  • Bioengineering
  • Electrical and Electronic Engineering
  • Biotechnology
  • Biomedical Engineering
  • Medicine (miscellaneous)
  • Computer Science Applications
  • Pharmaceutical Science

