Porpoise: a new approach for accurate prediction of RNA pseudouridine sites

Bibliographic Details
Published in Briefings in bioinformatics Vol. 22; no. 6
Main Authors Li, Fuyi, Guo, Xudong, Jin, Peipei, Chen, Jinxiang, Xiang, Dongxu, Song, Jiangning, Coin, Lachlan J M
Format Journal Article
Language English
Published England Oxford University Press 05.11.2021
Abstract Pseudouridine is a ubiquitous RNA modification present in both eukaryotes and prokaryotes, and it plays a vital role in various biological processes. Almost all kinds of RNA are subject to this modification. However, identifying pseudouridine sites experimentally remains a great challenge because it requires expensive and time-consuming experimental research. Computational approaches that can accurately identify pseudouridine sites in silico from the large amount of available RNA sequence data are therefore highly desirable and can aid in the functional elucidation of this critical modification. Here, we propose a new computational approach, termed Porpoise, to accurately identify pseudouridine sites from RNA sequence data. Porpoise builds upon a comprehensive evaluation of 18 frequently used feature encoding schemes, from which four types of features were selected: binary features, pseudo k-tuple composition, nucleotide chemical property and position-specific trinucleotide propensity based on single-strand (PSTNPss). The selected features are fed into a stacked ensemble learning framework to construct an effective stacked model. Both cross-validation tests on the benchmark dataset and independent tests show that Porpoise achieves better predictive performance than several state-of-the-art approaches. The application of model interpretation tools demonstrates the importance of PSTNPss for the performance of the trained models. This new method is anticipated to facilitate community-wide efforts to identify putative pseudouridine sites and formulate novel testable biological hypotheses.
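Two of the four feature types named in the abstract, binary features and nucleotide chemical property (NCP), can be illustrated with a short sketch. This is not the authors' implementation: the function names are hypothetical, and the lookup tables are assumptions based on the encodings commonly used in the RNA-modification literature (one-hot vectors for binary features; ring structure, functional group and hydrogen-bond strength for NCP). The pseudo k-tuple composition, PSTNPss and the stacked ensemble itself are not shown.

```python
# Illustrative sketch only; tables and names are assumptions, not taken from the record.

# Binary (one-hot) encoding: one indicator per nucleotide, order A, C, G, U.
BINARY = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "U": (0, 0, 0, 1),
}

# Nucleotide chemical property (NCP), commonly given as three flags:
# (purine ring, amino group, weak hydrogen bonding).
NCP = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "U": (0, 0, 1),
}


def encode(seq, table):
    """Concatenate the per-nucleotide vectors from `table` into one flat list."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    features = []
    for base in seq:
        if base not in table:
            raise ValueError(f"unsupported nucleotide: {base!r}")
        features.extend(table[base])
    return features


def binary_encoding(seq):
    """Hypothetical helper: 4 features per position."""
    return encode(seq, BINARY)


def ncp_encoding(seq):
    """Hypothetical helper: 3 features per position."""
    return encode(seq, NCP)
```

For a sequence window of length n, the two helpers return flat vectors of length 4n and 3n respectively, which could then be concatenated with other encodings before being passed to a classifier.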
Author Li, Fuyi
Jin, Peipei
Song, Jiangning
Xiang, Dongxu
Guo, Xudong
Chen, Jinxiang
Coin, Lachlan J M
Author_xml – sequence: 1
  givenname: Fuyi
  surname: Li
  fullname: Li, Fuyi
  organization: Department of Microbiology and Immunology, Peter Doherty Institute for Infection and Immunity, the University of Melbourne, Australia
– sequence: 2
  givenname: Xudong
  surname: Guo
  fullname: Guo, Xudong
  organization: Ningxia University, China
– sequence: 3
  givenname: Peipei
  surname: Jin
  fullname: Jin, Peipei
  organization: Department of Clinical Laboratory of Ruijin Hospital, affiliated with Shanghai Jiao Tong University School of Medicine, Shanghai, China
– sequence: 4
  givenname: Jinxiang
  surname: Chen
  fullname: Chen, Jinxiang
  organization: Northwest A&F University, China
– sequence: 5
  givenname: Dongxu
  surname: Xiang
  fullname: Xiang, Dongxu
  organization: Faculty of Engineering and Information Technology, The University of Melbourne, Australia
– sequence: 6
  givenname: Jiangning
  surname: Song
  fullname: Song, Jiangning
  organization: Monash Biomedicine Discovery Institute, Monash University, Australia
– sequence: 7
  givenname: Lachlan J M
  surname: Coin
  fullname: Coin, Lachlan J M
  organization: Department of Microbiology and Immunology at the University of Melbourne, Australia
BackLink https://www.ncbi.nlm.nih.gov/pubmed/34226915 (View this record in MEDLINE/PubMed)
CitedBy_id crossref_primary_10_1109_TCBB_2024_3389094
crossref_primary_10_3390_ijms25052869
crossref_primary_10_1093_bib_bbae169
crossref_primary_10_1080_07391102_2024_2318482
crossref_primary_10_31083_j_fbl2709269
crossref_primary_10_1007_s12539_022_00520_4
crossref_primary_10_1109_TCBB_2024_3467093
crossref_primary_10_1016_j_jbc_2024_107140
crossref_primary_10_1109_TCBB_2023_3283985
crossref_primary_10_1093_bib_bbad372
crossref_primary_10_1016_j_csbj_2024_04_052
crossref_primary_10_1093_bib_bbad209
crossref_primary_10_1038_s41587_024_02135_0
crossref_primary_10_1016_j_jmb_2022_167604
crossref_primary_10_1016_j_isci_2022_104883
crossref_primary_10_1016_j_ymthe_2022_05_001
crossref_primary_10_1186_s12859_024_05744_3
crossref_primary_10_1093_bib_bbad083
crossref_primary_10_3389_fncel_2022_1058083
crossref_primary_10_3390_math11030602
crossref_primary_10_1016_j_csbj_2022_01_019
crossref_primary_10_1186_s13059_021_02557_y
crossref_primary_10_1016_j_chemolab_2023_104847
crossref_primary_10_1093_bib_bbab461
crossref_primary_10_1093_bib_bbac031
crossref_primary_10_1371_journal_pone_0290538
crossref_primary_10_3390_genes13040677
crossref_primary_10_1016_j_compbiomed_2022_106368
crossref_primary_10_1109_TCBB_2021_3136905
crossref_primary_10_1093_bib_bbac467
crossref_primary_10_1016_j_ijbiomac_2023_124228
crossref_primary_10_1016_j_compbiomed_2023_107155
crossref_primary_10_1016_j_compbiomed_2023_107032
crossref_primary_10_1109_TCBB_2024_3418490
crossref_primary_10_1093_nar_gkae782
crossref_primary_10_12677_HJBM_2022_122014
crossref_primary_10_3389_fgene_2023_1121694
crossref_primary_10_3934_mbe_2022644
crossref_primary_10_1016_j_ymeth_2022_03_001
Cites_doi 10.1093/nar/gkz074
10.1093/bib/bbz022
10.1016/j.csbj.2020.07.010
10.1186/s12859-018-2321-0
10.1093/nar/gkab122
10.1016/j.celrep.2014.07.004
10.1038/srep34595
10.1038/nchembio.1836
10.1093/nar/gkv1036
10.1080/07391102.1998.10509006
10.1038/nature13802
10.21105/joss.00638
10.1038/onc.2011.449
10.1109/ACCESS.2020.2989469
10.3389/fbioe.2020.00134
10.1016/j.tibs.2013.01.002
10.1109/TPAMI.2005.159
10.1093/nar/gkaa692
10.1093/bioinformatics/btu852
10.1093/emboj/cdg191
10.1016/j.omtn.2020.08.022
10.1007/s00438-019-01600-9
10.1093/bib/bbaa275
10.1093/bioinformatics/bty653
10.1093/bioinformatics/btv366
10.1093/bib/bbz041
10.1093/bioinformatics/btz721
10.1109/CONFLUENCE.2017.7943141
10.1016/j.molcel.2011.09.017
10.1016/j.omtn.2019.03.010
10.1016/j.gpb.2019.08.002
10.1093/bib/bbaa049
10.3389/fgene.2020.00088
10.1093/bib/bbaa124
10.1080/152165400410182
ContentType Journal Article
Copyright The Author(s) 2021. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
DOI 10.1093/bib/bbab245
Discipline Biology
EISSN 1477-4054
ExternalDocumentID PMC8575008
34226915
10_1093_bib_bbab245
Genre Research Support, Non-U.S. Gov't
Journal Article
Research Support, N.I.H., Extramural
GrantInformation_xml – fundername: NIAID NIH HHS
  grantid: R01 AI111965
– fundername: ;
– fundername: ;
  grantid: R01 AI111965
– fundername: ;
  grantid: GNT1195743
– fundername: ;
  grantid: LP110200333; DP120104460
– fundername: ;
  grantid: APP1127948; APP1144652
– fundername: ;
  grantid: APP1103384
ISSN 1467-5463
1477-4054
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 6
Keywords bioinformatics
RNA pseudouridine site
sequence analysis
machine learning
stacking ensemble learning
Language English
License https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model
The Author(s) 2021. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
Notes Fuyi Li and Xudong Guo contributed equally to this work.
OpenAccessLink https://www.ncbi.nlm.nih.gov/pmc/articles/8575008
PMID 34226915
PublicationCentury 2000
PublicationDate 2021-11-05
PublicationDateYYYYMMDD 2021-11-05
PublicationDate_xml – month: 11
  year: 2021
  text: 2021-11-05
  day: 05
PublicationDecade 2020
PublicationPlace England
PublicationPlace_xml – name: England
PublicationTitle Briefings in bioinformatics
PublicationTitleAlternate Brief Bioinform
PublicationYear 2021
Publisher Oxford University Press
Publisher_xml – name: Oxford University Press
SubjectTerms Algorithms
Computational Biology - methods
Machine Learning
Problem Solving Protocol
Pseudouridine - chemistry
Pseudouridine - genetics
Reproducibility of Results
RNA - chemistry
RNA - genetics
Sequence Analysis, RNA - methods
Title Porpoise: a new approach for accurate prediction of RNA pseudouridine sites
URI https://www.ncbi.nlm.nih.gov/pubmed/34226915
https://www.proquest.com/docview/2548913445
https://pubmed.ncbi.nlm.nih.gov/PMC8575008
Volume 22