An efficient strategy for identifying essential proteins based on homology, subcellular location and protein-protein interaction information
High throughput biological experiments are expensive and time consuming. For the past few years, many computational methods based on biological information have been proposed and widely used to understand the biological background. However, the processing of biological information data inevitably pr...
Published in | Mathematical biosciences and engineering : MBE Vol. 19; no. 6; pp. 6331 - 6343 |
---|---|
Main Authors | Zhang, Zhihong; Luo, Yingchun; Jiang, Meiping; Wu, Dongjie; Zhang, Wang; Yan, Wei; Zhao, Bihai |
Format | Journal Article |
Language | English |
Published | United States; AIMS Press; 01.01.2022 |
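Subjects | essential protein; homology information; multiple biological information; non-negative matrix symmetric tri-factorization; protein-protein interaction; subcellular location information |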
Abstract | High-throughput biological experiments are expensive and time-consuming. In recent years, many computational methods based on biological information have been proposed and widely used to understand the underlying biology. However, processing biological data inevitably produces false positives and false negatives, such as the noise in Protein-Protein Interaction (PPI) networks and the noise generated by integrating multiple kinds of biological information. Handling this noise is key to essential protein prediction. This paper proposes IEPMSF, a model for Identifying Essential Proteins based on non-negative Matrix Symmetric tri-Factorization and multiple biological information. It uses only the common-neighbor characteristics of proteins in the PPI network to build a weighted network, then applies non-negative matrix symmetric tri-factorization to find more potential interactions between proteins and thereby optimize the weighted network. Next, the initial score of each protein is determined from subcellular location and homology information, and a random walk with restart is run on the optimized network to score and rank each protein. We tested the proposed prediction model against current representative approaches using a public database. The experiments show that the new method identifies essential proteins effectively, which indicates that it can substantially reduce the noise that exists in multi-source biological information itself and the noise that is caused by integrating it. |
---|---|
Author | Luo, Yingchun; Wu, Dongjie; Zhao, Bihai; Zhang, Zhihong; Yan, Wei; Zhang, Wang; Jiang, Meiping |
Author_xml | 1. Zhang, Zhihong (College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, Hunan 410022, China); 2. Luo, Yingchun (Department of Ultrasound, Hunan Provincial Maternal and Child Health Care Hospital, Changsha, Hunan 410008, China); 3. Jiang, Meiping (Department of Ultrasound, Hunan Provincial Maternal and Child Health Care Hospital, Changsha, Hunan 410008, China); 4. Wu, Dongjie (Department of Banking and Finance, Monash University, Clayton, Victoria 3168, Australia); 5. Zhang, Wang (Department of Optoelectronic Engineering, Jinan University, Guangzhou, Guangdong 510632, China); 6. Yan, Wei (College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, Hunan 410022, China); 7. Zhao, Bihai (College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, Hunan 410022, China) |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/35603404$$D View this record in MEDLINE/PubMed |
ContentType | Journal Article |
DBID | NPM AAYXX CITATION 7X8 DOA |
DOI | 10.3934/mbe.2022296 |
DatabaseName | PubMed CrossRef MEDLINE - Academic DOAJ Directory of Open Access Journals |
DatabaseTitle | PubMed CrossRef MEDLINE - Academic |
DatabaseTitleList | CrossRef PubMed |
Database_xml | – sequence: 1 dbid: DOA name: Directory of Open Access Journals url: https://www.doaj.org/ sourceTypes: Open Website – sequence: 2 dbid: NPM name: PubMed url: https://proxy.k.utb.cz/login?url=http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Biology |
EISSN | 1551-0018 |
EndPage | 6343 |
ExternalDocumentID | oai_doaj_org_article_cfa69b86f3ab4e5fbc0f18e214788eb8 10_3934_mbe_2022296 35603404 |
Genre | Journal Article |
GroupedDBID | --- 53G 5GY AENEX ALMA_UNASSIGNED_HOLDINGS EBD EBS EJD EMOBN F5P GROUPED_DOAJ IAO ITC J9A ML0 NPM OK1 P2P RAN SV3 TUS AAYXX CITATION 7X8 |
ID | FETCH-LOGICAL-c350t-398154e2b46a150add427d5b268b83380531f2c056201d7b96b3bc7ab50069253 |
IEDL.DBID | DOA |
ISSN | 1551-0018 |
IngestDate | Tue Oct 22 15:10:15 EDT 2024 Fri Oct 25 08:12:42 EDT 2024 Thu Sep 26 18:19:59 EDT 2024 Wed Oct 16 00:41:09 EDT 2024 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 6 |
Keywords | protein-protein interaction; essential protein; multiple biological information; subcellular location information; homology information; non-negative matrix symmetric tri-factorization |
Language | English |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-c350t-398154e2b46a150add427d5b268b83380531f2c056201d7b96b3bc7ab50069253 |
Notes | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
OpenAccessLink | https://doaj.org/article/cfa69b86f3ab4e5fbc0f18e214788eb8 |
PMID | 35603404 |
PQID | 2668219050 |
PQPubID | 23479 |
PageCount | 13 |
ParticipantIDs | doaj_primary_oai_doaj_org_article_cfa69b86f3ab4e5fbc0f18e214788eb8 proquest_miscellaneous_2668219050 crossref_primary_10_3934_mbe_2022296 pubmed_primary_35603404 |
PublicationCentury | 2000 |
PublicationDate | 2022-01-01 |
PublicationDateYYYYMMDD | 2022-01-01 |
PublicationDate_xml | – month: 01 year: 2022 text: 2022-01-01 day: 01 |
PublicationDecade | 2020 |
PublicationPlace | United States |
PublicationPlace_xml | – name: United States |
PublicationTitle | Mathematical biosciences and engineering : MBE |
PublicationTitleAlternate | Math Biosci Eng |
PublicationYear | 2022 |
Publisher | AIMS Press |
Publisher_xml | – name: AIMS Press |
SSID | ssj0034334 |
Score | 2.2932255 |
SourceID | doaj proquest crossref pubmed |
SourceType | Open Website Aggregation Database Index Database |
StartPage | 6331 |
SubjectTerms | essential protein; homology information; multiple biological information; non-negative matrix symmetric tri-factorization; protein-protein interaction; subcellular location information |
Title | An efficient strategy for identifying essential proteins based on homology, subcellular location and protein-protein interaction information |
URI | https://www.ncbi.nlm.nih.gov/pubmed/35603404 https://search.proquest.com/docview/2668219050 https://doaj.org/article/cfa69b86f3ab4e5fbc0f18e214788eb8 |
Volume | 19 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
linkProvider | Directory of Open Access Journals |
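
The abstract describes a pipeline in which PPI edges are weighted by common-neighbor information and proteins are then ranked by a random walk with restart. The sketch below illustrates only those two steps and is not the authors' implementation. The weighting formula, the function names `common_neighbor_weights` and `random_walk_with_restart`, the restart probability of 0.3, and the uniform starting scores are all assumptions made for illustration. In the paper the starting scores come from subcellular location and homology information, and the non-negative matrix symmetric tri-factorization step is omitted here entirely.

```python
def common_neighbor_weights(edges):
    """Weight each PPI edge by neighbor overlap (an assumed formula, not the paper's)."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    weights = {}
    for u, v in edges:
        shared = len(nbrs[u] & nbrs[v])   # proteins adjacent to both endpoints
        union = len(nbrs[u] | nbrs[v])    # at least 2 for an edge with u != v
        # The +1 keeps every observed edge at a positive weight (assumption).
        weights[(u, v)] = (1 + shared) / union
    return weights


def random_walk_with_restart(weights, start, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p = restart * p0 + (1 - restart) * W^T p until the scores settle."""
    nodes = set(start)
    for u, v in weights:
        nodes.update((u, v))
    nodes = sorted(nodes)

    # Symmetric weighted adjacency lists.
    out = {n: {} for n in nodes}
    for (u, v), w in weights.items():
        out[u][v] = w
        out[v][u] = w

    total = sum(start.values())
    if total <= 0:
        raise ValueError("starting scores must have a positive sum")
    p0 = {n: start.get(n, 0.0) / total for n in nodes}

    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            strength = sum(out[u].values())
            if strength == 0:
                # Isolated protein: return its walking mass to the restart distribution.
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * p0[n]
                continue
            for v, w in out[u].items():
                nxt[v] += (1 - restart) * p[u] * w / strength
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy network: a triangle A-B-C with one extra protein D attached to C.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
weights = common_neighbor_weights(edges)

# Placeholder starting scores; the paper derives these from biological data.
start = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}
scores = random_walk_with_restart(weights, start)

ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

On this toy network the walk concentrates on the best-connected protein, which is the intuition behind using the final scores as an essentiality ranking. Swapping in real starting scores only requires changing the `start` dictionary.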