SimplePPT: A simple principal tree algorithm

Qi Mao, Le Yang, Li Wang, Steve Goodison, Yijun Sun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

13 Scopus citations

Abstract

Many scientific datasets are high-dimensional, and their analysis usually requires visual manipulation that retains the most important structures of the data. The principal curve is a widely used approach for this purpose. However, many existing methods work only for data whose structures are not self-intersecting, which is quite restrictive in real applications. To address this issue, we develop a new model that captures the local information of the underlying graph structure based on reversed graph embedding. A generalization bound is derived showing that the model is consistent if the number of data points is sufficiently large. As a special case, a principal tree model is proposed, and a new algorithm is developed that learns a tree structure automatically from data. The new algorithm is simple and parameter-free with guaranteed convergence. Experimental results on synthetic and breast cancer datasets show that the proposed method compares favorably with baselines and can discover a breast cancer progression path with multiple branches.
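
The abstract does not give the update rules, so the following is only an illustrative sketch of how a tree might be learned from data by alternating steps; it is not the authors' algorithm. It assumes three steps per iteration: a minimum spanning tree over the current centers, a softmax assignment of points to centers, and a coordinate-wise center update that pulls each center toward its assigned points and its tree neighbors. The names `fit_tree`, `sigma`, `lam`, `n_centers`, and `n_iter` are hypothetical, and the tunable values are an assumption of this sketch rather than part of the method described above.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def minimum_spanning_tree(centers):
    """Prim's algorithm on the complete graph over the centers.

    Returns a list of (k, l) index pairs forming a spanning tree.
    """
    n = len(centers)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        k, l = min(
            ((i, j) for i in in_tree for j in range(n) if j not in in_tree),
            key=lambda e: sq_dist(centers[e[0]], centers[e[1]]),
        )
        edges.append((k, l))
        in_tree.add(l)
    return edges


def soft_assign(points, centers, sigma):
    """Softmax over negative squared distances, one row per point."""
    R = []
    for x in points:
        d = [sq_dist(x, c) for c in centers]
        m = min(d)  # subtract the minimum for numerical stability
        w = [math.exp(-(dk - m) / sigma) for dk in d]
        s = sum(w)
        R.append([wk / s for wk in w])
    return R


def update_centers(points, centers, R, edges, lam):
    """One in-place sweep: each center minimizes, with the others held fixed,
    sum_i R[i][k] * |x_i - c_k|^2 + lam * sum_{l ~ k} |c_k - c_l|^2."""
    neighbors = {k: [] for k in range(len(centers))}
    for k, l in edges:
        neighbors[k].append(l)
        neighbors[l].append(k)
    new = [list(c) for c in centers]
    for k in range(len(centers)):
        mass = sum(row[k] for row in R)
        denom = mass + lam * len(neighbors[k])
        for d in range(len(points[0])):
            data_term = sum(R[i][k] * points[i][d] for i in range(len(points)))
            tree_term = lam * sum(new[l][d] for l in neighbors[k])
            new[k][d] = (data_term + tree_term) / denom
    return new


def fit_tree(points, n_centers=10, sigma=0.05, lam=1.0, n_iter=30, seed=0):
    """Alternate tree construction, soft assignment, and center updates."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, n_centers)]
    for _ in range(n_iter):
        edges = minimum_spanning_tree(centers)
        R = soft_assign(points, centers, sigma)
        centers = update_centers(points, centers, R, edges, lam)
    # Rebuild the tree so the returned edges match the final centers.
    return centers, minimum_spanning_tree(centers)
```

Calling `fit_tree(points, n_centers=8)` on a list of 2-D tuples returns eight centers and the seven edges of a tree connecting them; on branching data, the tree may or may not follow the branches depending on the assumed `sigma` and `lam`.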

Original language: English (US)
Title of host publication: SIAM International Conference on Data Mining 2015, SDM 2015
Editors: Suresh Venkatasubramanian, Jieping Ye
Publisher: Society for Industrial and Applied Mathematics Publications
Pages: 792-800
Number of pages: 9
ISBN (Electronic): 9781510811522
State: Published - 2015
Event: SIAM International Conference on Data Mining 2015, SDM 2015 - Vancouver, Canada
Duration: Apr 30 2015 → May 2 2015

Publication series

Name: SIAM International Conference on Data Mining 2015, SDM 2015

Other

Other: SIAM International Conference on Data Mining 2015, SDM 2015
Country/Territory: Canada
City: Vancouver
Period: 4/30/15 → 5/2/15

Keywords

  • Cancer progression path
  • Principal curve
  • Principal graph
  • Reversed graph embedding

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Vision and Pattern Recognition
  • Software
