Porpoise: a new approach for accurate prediction of RNA pseudouridine sites

Bibliographic Details
Published in	Briefings in bioinformatics Vol. 22; no. 6
Main Authors	Li, Fuyi, Guo, Xudong, Jin, Peipei, Chen, Jinxiang, Xiang, Dongxu, Song, Jiangning, Coin, Lachlan J M
Format	Journal Article
Language	English
Published	England Oxford University Press 05.11.2021
Subjects	Algorithms; Computational Biology - methods; Machine Learning; Problem Solving Protocol; Pseudouridine - chemistry; Pseudouridine - genetics; Reproducibility of Results; RNA - chemistry; RNA - genetics; Sequence Analysis, RNA - methods; bioinformatics; RNA pseudouridine site; sequence analysis; machine learning; stacking ensemble learning

Abstract	Pseudouridine is a ubiquitous RNA modification present in both eukaryotes and prokaryotes, and it plays a vital role in various biological processes. Almost all kinds of RNAs are subject to this modification. However, identifying pseudouridine sites experimentally remains a great challenge because it requires expensive and time-consuming experimental work. Therefore, computational approaches that can accurately identify pseudouridine sites in silico from the large amount of available RNA sequence data are highly desirable and can aid in the functional elucidation of this critical modification. Here, we propose a new computational approach, termed Porpoise, to accurately identify pseudouridine sites from RNA sequence data. Porpoise builds upon a comprehensive evaluation of 18 frequently used feature encoding schemes, from which four types of features were selected: binary features, pseudo k-tuple composition, nucleotide chemical property and position-specific trinucleotide propensity based on single-strand (PSTNPss). The selected features are fed into a stacked ensemble learning framework to construct an effective stacked model. Both cross-validation tests on the benchmark dataset and independent tests show that Porpoise achieves better predictive performance than several state-of-the-art approaches. The application of model interpretation tools demonstrates the importance of PSTNPss for the performance of the trained models. This new method is anticipated to facilitate community-wide efforts to identify putative pseudouridine sites and formulate novel testable biological hypotheses.
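As a rough illustration of two of the four feature types named in the abstract (binary features and nucleotide chemical property), the sketch below encodes a short RNA window into a numeric vector. It is not the Porpoise implementation: the function names, the example window, and the specific bit assignments are assumptions, with the chemical-property triples following a commonly used convention (ring structure, functional group, hydrogen-bond strength). Pseudo k-tuple composition, PSTNPss and the stacked model itself are not shown.

```python
# Illustrative sketch only; names and encodings are assumptions, not taken
# from the Porpoise source code.

# Binary (one-hot) encoding: one position per nucleotide.
BINARY = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "U": (0, 0, 0, 1),
}

# Nucleotide chemical property (assumed convention):
# (purine ring = 1, amino group = 1, weak hydrogen bonding = 1)
NCP = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "U": (0, 0, 1),
}


def encode(seq, table):
    """Concatenate the per-nucleotide codes from `table` along the sequence."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    features = []
    for base in seq:
        if base not in table:
            raise ValueError(f"unsupported nucleotide: {base!r}")
        features.extend(table[base])
    return features


def encode_site(seq):
    """Hypothetical helper: binary features followed by chemical properties."""
    return encode(seq, BINARY) + encode(seq, NCP)


if __name__ == "__main__":
    window = "GAUCU"  # made-up 5-nt window centred on a candidate uridine
    vector = encode_site(window)
    print(len(vector))  # 5 * 4 binary values + 5 * 3 property values
```

A vector built this way could be passed to any standard classifier; the record does not state the window length or the base learners used, so those are left open here.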
Author Affiliations	Li, Fuyi (Department of Microbiology and Immunology, Peter Doherty Institute for Infection and Immunity, the University of Melbourne, Australia); Guo, Xudong (Ningxia University, China); Jin, Peipei (Department of Clinical Laboratory of Ruijin Hospital, affiliated with Shanghai Jiao Tong University School of Medicine, Shanghai, China); Chen, Jinxiang (Northwest A&F University, China); Xiang, Dongxu (Faculty of Engineering and Information Technology, The University of Melbourne, Australia); Song, Jiangning (Monash Biomedicine Discovery Institute, Monash University, Australia); Coin, Lachlan J M (Department of Microbiology and Immunology at the University of Melbourne, Australia)
BackLink	https://www.ncbi.nlm.nih.gov/pubmed/34226915 (MEDLINE/PubMed record)
Copyright	The Author(s) 2021. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
DOI	10.1093/bib/bbab245
Discipline	Biology
EISSN	1477-4054
Genre	Research Support, Non-U.S. Gov't Journal Article Research Support, N.I.H., Extramural
Grant Information	NIAID NIH HHS, grant R01 AI111965. Further grant IDs listed without a funder name: GNT1195743; LP110200333; DP120104460; APP1127948; APP1144652; APP1103384.
ISSN	1467-5463 1477-4054
IsDoiOpenAccess	false
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Issue	6
Keywords	bioinformatics; RNA pseudouridine site; sequence analysis; machine learning; stacking ensemble learning
License	This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
Notes	Fuyi Li and Xudong Guo contributed equally to this work.
OpenAccessLink	https://www.ncbi.nlm.nih.gov/pmc/articles/8575008
PMID	34226915
PublicationTitleAlternate	Brief Bioinform