Automatic selection of reference taxa for protein–protein interaction prediction with phylogenetic profiling
Published in | Bioinformatics (Oxford, England) Vol. 28; no. 6; pp. 851–857 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Oxford, Oxford University Press, 15.03.2012 |
Subjects | |
ISSN | 1367-4803, 1367-4811 |
DOI | 10.1093/bioinformatics/btr720 |
Summary: | Motivation: Phylogenetic profiling methods can achieve good accuracy in predicting protein–protein interactions, especially in prokaryotes. Recent studies have shown that the choice of reference taxa (RT) is critical for accurate prediction, but with more than 2500 fully sequenced taxa publicly available, identifying the most-informative RT is becoming increasingly difficult. Previous studies on the selection of RT have provided guidelines for manual taxon selection, and for eliminating closely related taxa. However, no general strategy for automatic selection of RT is currently available.
Results: We present three novel methods for automating the selection of RT, using machine learning based on known protein–protein interaction networks. One of these methods in particular, Tree-Based Search, yields greatly improved prediction accuracies. We further show that different methods for constituting phylogenetic profiles often require very different RT sets to support high prediction accuracy.
Availability: The datasets and software used in the experiments can be found at http://users-birc.au.dk/zxr/phyloprof/
Contact: zxr@birc.au.dk; somme89@gmail.com
Supplementary information: Supplementary data are available at Bioinformatics online. |
---|---|