Assessing and Predicting Protein Interactions Using Both Local and Global Network Topological Metrics

Bibliographic Details
Published in Genome Informatics Vol. 21; pp. 138-149
Main Authors Li, Jinyan; Wong, Limsoon; Liu, Guimei
Format Journal Article
Language English
Published Japan, Japanese Society for Bioinformatics, 2008
ISSN 0919-9454
EISSN 2185-842X
DOI 10.11234/gi1990.21.138

Abstract High-throughput protein interaction data, with ever-increasing volume, are becoming the foundation of many biological discoveries. However, such data are often associated with high false positive and false negative rates, so scalable methods for identifying these errors are desirable. In this paper, we develop a computational method to identify spurious interactions and missing interactions in high-throughput protein interaction data. Our method uses both local and global topological information about protein pairs, and it assigns a local interacting score and a global interacting score to every protein pair. The local interacting score is calculated from the common neighbors of the two proteins in a pair. The global interacting score is computed using globally interacting protein group pairs. The two scores are then combined into a final score, called LGTweight, that indicates how likely two proteins are to interact. We tested our method on the DIP yeast interaction dataset. The experimental results show that the interactions ranked top by our method have higher functional homogeneity and localization coherence than those ranked top by existing methods, and that our method also achieves higher sensitivity and precision than existing methods under 5-fold cross validation.
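
The local score described in the abstract is built from the common neighbors of a protein pair. The abstract does not give the actual formula, so the following is only a minimal sketch of the general idea, assuming a plain Jaccard overlap of neighbor sets as a stand-in. It is not the paper's local interacting score or LGTweight, and the function name `common_neighbor_scores` and the toy edge list are hypothetical.

```python
from itertools import combinations


def common_neighbor_scores(edges):
    """Score every protein pair by the Jaccard overlap of their neighbor sets.

    Assumption: Jaccard overlap is used purely for illustration; the paper's
    own local score is not specified in the abstract.
    """
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        # Drop u and v themselves so a direct edge is not counted as a shared partner.
        nu = neighbors[u] - {v}
        nv = neighbors[v] - {u}
        union = nu | nv
        scores[(u, v)] = len(nu & nv) / len(union) if union else 0.0
    return scores


# Hypothetical toy network: A and B are not linked directly but share partners C and D.
edges = [("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("A", "E")]
scores = common_neighbor_scores(edges)
```

In this toy network the unobserved pair (A, B) scores well because it shares two partners, which is the kind of signal a method could use to propose a missing interaction. The observed pair (A, C) shares no partners and scores zero, which is the kind of signal that could flag a spurious one.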
Author Affiliations
Li, Jinyan: School of Computer Engineering, Nanyang Technological University
Wong, Limsoon: School of Computing, National University of Singapore
Liu, Guimei: School of Computing, National University of Singapore
SubjectTerms Algorithms
False Negative Reactions
False Positive Reactions
Fungal Proteins - chemistry
Fungal Proteins - genetics
Fungal Proteins - metabolism
Kinetics
Models, Genetic
Models, Theoretical
network topology
Predictive Value of Tests
protein-protein interaction
Proteins - chemistry
Proteins - genetics
Proteins - metabolism
Reproducibility of Results
Yeasts - genetics
Yeasts - metabolism
URI https://www.jstage.jst.go.jp/article/gi1990/21/0/21_0_138/_article/-char/en
https://www.ncbi.nlm.nih.gov/pubmed/19425154
https://www.proquest.com/docview/66709604