Statistical analysis and prediction of protein-protein interfaces

Andrew J. Bordner, Ruben Abagyan

Research output: Contribution to journal › Article › peer-review

137 Scopus citations


Predicting protein-protein interfaces from a three-dimensional structure is a key task of computational structural proteomics. In contrast to geometrically distinct small molecule binding sites, protein-protein interfaces are notoriously difficult to predict. We generated a large nonredundant data set of 1494 true protein-protein interfaces using biological symmetry annotation where necessary. The data set was carefully analyzed and a Support Vector Machine was trained on a combination of a new robust evolutionary conservation signal with the local surface properties to predict protein-protein interfaces. Fivefold cross validation verifies the high sensitivity and selectivity of the model. As much as 97% of the predicted patches had an overlap with the true interface patch while only 22% of the surface residues were included in an average predicted patch. The model allowed the identification of potential new interfaces and the correction of mislabeled oligomeric states.
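The two summary statistics quoted in the abstract (the share of predicted patches that overlap the true interface, and the average share of surface residues included in a predicted patch) can be illustrated with a minimal sketch. The abstract does not define "overlap" precisely, so the code assumes it means at least one shared residue. The function name `patch_statistics`, the integer residue indices, and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Set, Tuple


def patch_statistics(
    predictions: Iterable[Tuple[Set[int], Set[int], int]],
) -> Tuple[float, float]:
    """Return (overlap rate, mean surface coverage) over predicted patches.

    Each item is (predicted_patch, true_interface, n_surface_residues),
    where residues are hypothetical integer indices.
    """
    overlapping = 0
    coverage_sum = 0.0
    count = 0
    for predicted, true_interface, n_surface in predictions:
        if n_surface <= 0:
            raise ValueError("n_surface_residues must be positive")
        # Assumption: "overlap" means at least one residue in common.
        if predicted & true_interface:
            overlapping += 1
        # Fraction of this protein's surface residues inside the predicted patch.
        coverage_sum += len(predicted) / n_surface
        count += 1
    if count == 0:
        raise ValueError("no predictions given")
    return overlapping / count, coverage_sum / count


# Toy, made-up data for illustration only (not results from the paper).
toy = [
    ({1, 2, 3}, {3, 4, 5}, 12),  # shares residue 3; covers 3 of 12 surface residues
    ({7, 8}, {1, 2}, 8),         # no shared residue; covers 2 of 8 surface residues
]
overlap_rate, mean_coverage = patch_statistics(toy)
print(overlap_rate, mean_coverage)
```

Under these assumptions, a high overlap rate combined with a low mean coverage is the pattern the abstract reports: most predicted patches touch the true interface while each patch covers only a small part of the surface.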

Original language: English (US)
Pages (from-to): 353-366
Number of pages: 14
Journal: Proteins: Structure, Function and Genetics
Issue number: 3
State: Published - Aug 15 2005


  • Binding sites
  • Dimerization
  • Evolutionary conservation
  • Protein interactions
  • Protein surface annotation
  • Statistical tests
  • Support Vector Machines

ASJC Scopus subject areas

  • Structural Biology
  • Biochemistry
  • Molecular Biology

