Prioritization of potential candidate disease genes by topological similarity of protein–protein interaction network and phenotype data

Bibliographic Details
Published in	Journal of biomedical informatics Vol. 53; pp. 229 - 236
Main Authors	Luo, Jiawei; Liang, Shiyu
Format	Journal Article
Language	English
Published	United States: Elsevier Inc, 01.02.2015
Subjects	Algorithms; Breast Neoplasms - genetics; Cancer; Computational Biology - methods; Diabetes Mellitus, Type 2 - genetics; Disease genes; Female; Genes; Genetic Predisposition to Disease; Humans; Male; Models, Statistical; Networks; Phenotype; Prostate; Prostatic Neoplasms - genetics; Protein Interaction Mapping - methods; Protein Interaction Maps; Protein–protein interaction networks; Random walk; Similarity; Software; Tasks; Topological similarity; Topology
Summary:	• We construct a reliable heterogeneous network by fusing multiple networks. • We devise a random walk-based algorithm on the reliable heterogeneous network. • Combining topological similarity with phenotype data helps to predict causal genes. • The algorithm still performs well at low parameter values. Identifying candidate disease genes is important for improving medical care, but the task remains challenging in the post-genomic era. Several computational approaches have been proposed to prioritize potential candidate genes using protein–protein interaction (PPI) networks. However, experimental PPI networks are liable to contain a number of spurious interactions. In this paper, we construct a reliable heterogeneous network by fusing multiple networks: a PPI network reconstructed by topological similarity, a phenotype similarity network, and known associations between diseases and genes. We then devise a random walk-based algorithm on this reliable heterogeneous network, called RWRHN, to prioritize potential candidate genes for inherited diseases. Leave-one-out cross-validation experiments show that RWRHN performs better than the RWRH and CIPHER methods in inferring disease genes. Furthermore, RWRHN is used to predict novel causal genes for 16 diseases, including breast cancer, diabetes mellitus type 2, and prostate cancer, as well as to detect disease-related protein complexes. The top predictions are supported by literature evidence.
ISSN:	1532-0464, 1532-0480
DOI:	10.1016/j.jbi.2014.11.004
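
The summary describes ranking candidate genes with a random walk on a network that joins a gene (PPI) layer, a phenotype layer, and known gene–disease associations. The sketch below illustrates only that general idea, a plain random walk with restart on one combined graph. It is not the authors' RWRHN: the toy network, the node names, the single column-normalized transition matrix, and the restart probability of 0.7 are all assumptions made for illustration, and the topological-similarity reconstruction of the PPI layer is not modeled.

```python
def column_normalize(matrix):
    """Scale each column to sum to 1; all-zero columns stay zero."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [
        [matrix[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until p stops changing."""
    n = len(adjacency)
    W = column_normalize(adjacency)
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)  # restart mass is spread evenly over the seeds
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * p0[i]
            for i in range(n)
        ]
        change = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical toy network: four genes and two disease phenotypes.
nodes = ["g0", "g1", "g2", "g3", "d0", "d1"]
index = {name: i for i, name in enumerate(nodes)}
edges = [
    ("g0", "g1"), ("g1", "g2"), ("g2", "g3"),  # gene-gene (PPI) layer
    ("d0", "d1"),                              # phenotype similarity layer
    ("g0", "d0"), ("g2", "d1"),                # known gene-disease associations
]

# Undirected, unweighted adjacency matrix over all six nodes.
adjacency = [[0.0] * len(nodes) for _ in nodes]
for a, b in edges:
    adjacency[index[a]][index[b]] = 1.0
    adjacency[index[b]][index[a]] = 1.0

# Query disease d0, seeded together with its known gene g0.
seeds = [index["d0"], index["g0"]]
scores = random_walk_with_restart(adjacency, seeds)

# Rank the genes that are not already seeds by their steady-state score.
candidates = ["g1", "g2", "g3"]
ranked = sorted(candidates, key=lambda g: scores[index[g]], reverse=True)
print(ranked)
```

Seeding both the query phenotype and its known gene lets evidence reach a candidate through either layer, which is the intuition behind combining phenotype data with the interaction network. A faithful implementation would need the transition matrix and parameters defined in the paper itself.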