An efficient strategy for identifying essential proteins based on homology, subcellular location and protein-protein interaction information

High throughput biological experiments are expensive and time consuming. For the past few years, many computational methods based on biological information have been proposed and widely used to understand the biological background. However, the processing of biological information data inevitably pr...

Bibliographic Details
Published in Mathematical biosciences and engineering : MBE Vol. 19; no. 6; pp. 6331-6343
Main Authors Zhang, Zhihong, Luo, Yingchun, Jiang, Meiping, Wu, Dongjie, Zhang, Wang, Yan, Wei, Zhao, Bihai
Format Journal Article
Language English
Published United States AIMS Press 01.01.2022
Abstract High-throughput biological experiments are expensive and time-consuming. In recent years, many computational methods based on biological information have been proposed and widely used to understand the biological background. However, processing biological data inevitably produces false positives and false negatives, such as the noise in Protein-Protein Interaction (PPI) networks and the noise generated by integrating several kinds of biological information. Handling this noise plays a key role in essential protein prediction. This paper proposes IEPMSF, a model for Identifying Essential Proteins based on non-negative Matrix Symmetric tri-Factorization and multiple biological information. The model uses only the common-neighbor characteristics of proteins in the PPI network to build a weighted network, then applies non-negative matrix symmetric tri-factorization to find more potential interactions between proteins and thereby optimize the weighted network. Next, the initial score of each protein is determined from subcellular location and lineal homology information, and a random walk with restart algorithm is applied to the optimized network to score and rank each protein. We tested the proposed prediction model against current representative approaches using a public database. The experiments show that the new method identifies essential proteins with high efficiency. Its effectiveness indicates that it can largely resolve the noise that exists in the multi-source biological information itself and the noise caused by integrating it.
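The scoring stage described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the edge-weight formula, the restart probability of 0.3, the protein names, and the initial scores are all assumptions chosen for illustration, and the matrix tri-factorization step that refines the network is omitted entirely. The sketch only shows how a random walk with restart turns initial scores and a common-neighbor-weighted network into a ranking.

```python
def common_neighbor_weights(edges):
    """Build neighbor sets and weight each edge by shared neighbors.

    The formula (common + 1) / |union of neighborhoods| is an assumed,
    illustrative weighting, not the one defined in the paper.
    """
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    weights = {}
    for u, v in edges:
        common = len(nbrs[u] & nbrs[v])
        # The +1 keeps edges without shared neighbors from getting weight 0.
        w = (common + 1) / len(nbrs[u] | nbrs[v])
        weights[(u, v)] = weights[(v, u)] = w
    return nbrs, weights


def random_walk_with_restart(nbrs, weights, start, restart=0.3,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until it settles."""
    total = sum(start.values())
    # Normalize the initial scores into a restart distribution p0.
    p0 = {n: start.get(n, 0.0) / total for n in nbrs}
    # Total outgoing weight of each node, used to normalize transitions.
    out = {n: sum(weights[(n, m)] for m in nbrs[n]) for n in nbrs}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {}
        for n in nbrs:
            flow = sum(p[m] * weights[(m, n)] / out[m] for m in nbrs[n])
            nxt[n] = (1 - restart) * flow + restart * p0[n]
        delta = sum(abs(nxt[n] - p[n]) for n in nbrs)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network and initial scores (in the paper these would come
# from subcellular location and homology data).
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
nbrs, weights = common_neighbor_weights(edges)
scores = random_walk_with_restart(nbrs, weights, {"A": 1.0, "B": 0.5})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Proteins near the top of `ranking` would be the candidates for essential proteins under these assumptions; changing the restart probability shifts the balance between the initial scores and the network structure.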
Author_xml – sequence: 1
  givenname: Zhihong
  surname: Zhang
  fullname: Zhang, Zhihong
  organization: College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, Hunan 410022, China
– sequence: 2
  givenname: Yingchun
  surname: Luo
  fullname: Luo, Yingchun
  organization: Department of Ultrasound, Hunan Provincial Maternal and Child Health Care Hospital, Changsha, Hunan 410008, China
– sequence: 3
  givenname: Meiping
  surname: Jiang
  fullname: Jiang, Meiping
  organization: Department of Ultrasound, Hunan Provincial Maternal and Child Health Care Hospital, Changsha, Hunan 410008, China
– sequence: 4
  givenname: Dongjie
  surname: Wu
  fullname: Wu, Dongjie
  organization: Department of Banking and Finance, Monash University, Clayton, Victoria 3168, Australia
– sequence: 5
  givenname: Wang
  surname: Zhang
  fullname: Zhang, Wang
  organization: Department of Optoelectronic Engineering, Jinan University, Guangzhou, Guangdong 510632, China
– sequence: 6
  givenname: Wei
  surname: Yan
  fullname: Yan, Wei
  organization: College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, Hunan 410022, China
– sequence: 7
  givenname: Bihai
  surname: Zhao
  fullname: Zhao, Bihai
  organization: College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, Hunan 410022, China
BackLink https://www.ncbi.nlm.nih.gov/pubmed/35603404$$D View this record in MEDLINE/PubMed
DOI 10.3934/mbe.2022296
Discipline Biology
EISSN 1551-0018
ISSN 1551-0018
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 6
Keywords protein-protein interaction
essential protein
multiple biological information
subcellular location information
homology information
non-negative matrix symmetric tri-factorization
OpenAccessLink https://doaj.org/article/cfa69b86f3ab4e5fbc0f18e214788eb8
PMID 35603404