A simple guide to de novo transcriptome assembly and annotation
Published in | Briefings in bioinformatics Vol. 23; no. 2 |
---|---|
Main Authors | Raghavan, Venket; Kraft, Louis; Mesny, Fantin; Rigerte, Linda |
Format | Journal Article |
Language | English |
Published | England; Oxford University Press; 10.03.2022; Oxford Publishing Limited (England) |
Abstract | A transcriptome constructed from short-read RNA sequencing (RNA-seq) is an easily attainable proxy catalog of protein-coding genes when genome assembly is unnecessary, expensive or difficult. In the absence of a sequenced genome to guide the reconstruction process, the transcriptome must be assembled de novo using only the information available in the RNA-seq reads. Subsequently, the sequences must be annotated in order to identify sequence-intrinsic and evolutionary features in them (for example, protein-coding regions). Although straightforward at first glance, de novo transcriptome assembly and annotation can quickly prove to be challenging undertakings. In addition to familiarizing themselves with the conceptual and technical intricacies of the tasks at hand and the numerous pre- and post-processing steps involved, those interested must also grapple with an overwhelmingly large choice of tools. The lack of standardized workflows, fast pace of development of new tools and techniques and paucity of authoritative literature have served to exacerbate the difficulty of the task even further. Here, we present a comprehensive overview of de novo transcriptome assembly and annotation. We discuss the procedures involved, including pre- and post-processing steps, and present a compendium of corresponding tools. |
---|---|
Author | Kraft, Louis; Mesny, Fantin; Rigerte, Linda; Raghavan, Venket |
Author_xml | 1. Raghavan, Venket (vraghav@mpibpc.mpg.de); 2. Kraft, Louis (ORCID 0000-0002-6465-4973; louis.kraft@mpibpc.mpg.de); 3. Mesny, Fantin; 4. Rigerte, Linda |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/35076693 (record in MEDLINE/PubMed) |
ContentType | Journal Article |
Copyright | The Author(s) 2022. Published by Oxford University Press. |
DOI | 10.1093/bib/bbab563 |
Discipline | Biology |
EISSN | 1477-4054 |
ExternalDocumentID | PMC8921630 35076693 10_1093_bib_bbab563 10.1093/bib/bbab563 |
Genre | Journal Article Review |
ISSN | 1467-5463 1477-4054 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 2 |
Keywords | annotation; assembly; RNA-seq; tools; de novo transcriptome |
Language | English |
License | This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
Notes | Venket Raghavan and Louis Kraft are joint first coauthors. Fantin Mesny and Linda Rigerte are joint second coauthors. |
ORCID | 0000-0002-6465-4973 |
OpenAccessLink | https://dx.doi.org/10.1093/bib/bbab563 |
PMID | 35076693 |
PublicationDate | 2022-03-10 |
PublicationPlace | England |
PublicationTitle | Briefings in bioinformatics |
PublicationTitleAlternate | Brief Bioinform |
PublicationYear | 2022 |
Publisher | Oxford University Press; Oxford Publishing Limited (England) |
transcriptomes publication-title: Genome Res doi: 10.1101/gr.243212.118 – volume: 6 issue: 57 year: 2021 ident: 2022031506250617800_ref232 article-title: The targets R package: a dynamic make-like function-oriented pipeline toolkit for reproducibility and high-performance computing publication-title: J Open Source Softw doi: 10.21105/joss.02959 – volume: 8 start-page: 1874 year: 2019 ident: 2022031506250617800_ref27 article-title: Falco: high-speed FastQC emulation for quality control of sequencing data publication-title: F1000Res doi: 10.12688/f1000research.21142.1 – volume: 33 start-page: 2583 issue: 16 year: 2017 ident: 2022031506250617800_ref248 article-title: MISA-web: a web server for microsatellite prediction publication-title: Bioinformatics doi: 10.1093/bioinformatics/btx198 – volume: 16 start-page: 30 issue: 1 year: 2015 ident: 2022031506250617800_ref71 article-title: Bridger: a new framework for de novo transcriptome assembly using RNA-seq data publication-title: Genome Biol doi: 10.1186/s13059-015-0596-2 – volume: 11 start-page: 220 issue: 12 year: 2010 ident: 2022031506250617800_ref115 article-title: From RNA-seq reads to differential expression results publication-title: Genome Biol doi: 10.1186/gb-2010-11-12-220 – volume: 12 start-page: 323 issue: 1 year: 2011 ident: 2022031506250617800_ref96 article-title: RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome publication-title: BMC Bioinformatics doi: 10.1186/1471-2105-12-323 – volume: 15 start-page: 2147 issue: 8 year: 2013 ident: 2022031506250617800_ref168 article-title: Pico-PLAZA, a genome database of microbial photosynthetic eukaryotes publication-title: Environ Microbiol doi: 10.1111/1462-2920.12174 – volume: 26 start-page: 1134 issue: 8 year: 2016 ident: 2022031506250617800_ref80 article-title: TransRate: reference-free quality assessment of de novo transcriptome assemblies publication-title: Genome Res doi: 10.1101/gr.196469.115 – volume: 34 
start-page: 525 issue: 5 year: 2016 ident: 2022031506250617800_ref97 article-title: Near-optimal probabilistic RNA-seq quantification publication-title: Nat Biotechnol doi: 10.1038/nbt.3519 – volume: 10 issue: 1 year: 2021 ident: 2022031506250617800_ref223 article-title: Streamlining data-intensive biology with workflow systems publication-title: Gigascience doi: 10.1093/gigascience/giaa140 – volume-title: Common workflow language year: 2016 ident: 2022031506250617800_ref225 – volume: 5 issue: e2988 year: 2017 ident: 2022031506250617800_ref87 article-title: Compacting and correcting trinity and oases RNA-Seq de novo assemblies publication-title: PeerJ – year: 2019 ident: 2022031506250617800_ref204 article-title: Transcriptome computational workbench (TCW): analysis of single and comparative transcriptomes doi: 10.1101/733311 – volume: 7 start-page: 909 issue: 11 year: 2010 ident: 2022031506250617800_ref65 article-title: De novo assembly and analysis of RNA-seq data publication-title: Nat Methods doi: 10.1038/nmeth.1517 – volume: 49 start-page: D192 issue: D1 year: 2021 ident: 2022031506250617800_ref41 article-title: Rfam 14: expanded coverage of metagenomic, viral and microRNA families publication-title: Nucleic Acids Res doi: 10.1093/nar/gkaa1047 – volume: 10 start-page: 317 year: 2019 ident: 2022031506250617800_ref11 article-title: Single-cell RNA-seq technologies and related computational data analysis publication-title: Front Genet doi: 10.3389/fgene.2019.00317 – volume: 12 issue: 10 year: 2017 ident: 2022031506250617800_ref31 article-title: BBMerge – accurate paired shotgun read merging via overlap publication-title: PLoS One doi: 10.1371/journal.pone.0185056 – volume: 30 year: 2014 ident: 2022031506250617800_ref212 article-title: Raxml version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies publication-title: Bioinformatics doi: 10.1093/bioinformatics/btu033 – volume: 47 start-page: D309 issue: D1 year: 2019 ident: 
2022031506250617800_ref184 article-title: eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses publication-title: Nucleic Acids Res doi: 10.1093/nar/gky1085 – volume: 45 start-page: W12 issue: W1 year: 2017 ident: 2022031506250617800_ref137 article-title: CPC2: a fast and accurate coding potential calculator based on sequence intrinsic features publication-title: Nucleic Acids Res doi: 10.1093/nar/gkx428 – volume: 4 start-page: 48 issue: 1 year: 2015 ident: 2022031506250617800_ref29 article-title: Rcorrector: efficient and accurate error correction for illumina RNA-seq reads publication-title: Gigascience doi: 10.1186/s13742-015-0089-y – volume: 21 start-page: 630 issue: 10 year: 2020 ident: 2022031506250617800_ref1 article-title: mRNAs, proteins and the emerging principles of gene expression control publication-title: Nat Rev Genet doi: 10.1038/s41576-020-0258-4 – volume: 30 year: 2013 ident: 2022031506250617800_ref210 article-title: Mafft multiple sequence alignment software version 7: improvements in performance and usability publication-title: Mol Biol Evol doi: 10.1093/molbev/mst010 – volume: 18 start-page: 392 issue: 8 year: 2020 ident: 2022031506250617800_ref18 article-title: E novo transcriptome assembly and gene expression profiling of the copepod calanus helgolandicus feeding on the PUA-producing diatom skeletonema marinoi publication-title: Mar Drugs doi: 10.3390/md18080392 – volume: 46 start-page: D8 issue: D1 year: 2018 ident: 2022031506250617800_ref161 article-title: Database resources of the national center for biotechnology information publication-title: Nucleic Acids Res doi: 10.1093/nar/gkx1095 – volume: 23 start-page: 1282 issue: 10 year: 2007 ident: 2022031506250617800_ref164 article-title: UniRef: comprehensive and non-redundant UniProt reference clusters publication-title: Bioinformatics doi: 10.1093/bioinformatics/btm098 – volume: 26 start-page: 1721 issue: 12 
year: 2016 ident: 2022031506250617800_ref36 article-title: Centrifuge: rapid and sensitive classification of metagenomic sequences publication-title: Genome Res doi: 10.1101/gr.210641.116 – volume: 32 start-page: 2577 issue: 17 year: 2016 ident: 2022031506250617800_ref79 article-title: DOGMA: domain-based transcriptome and proteome quality assessment publication-title: Bioinformatics doi: 10.1093/bioinformatics/btw231 – volume: 29 start-page: 644 issue: 7 year: 2011 ident: 2022031506250617800_ref46 article-title: Full-length transcriptome assembly from RNA-Seq data without a reference genome publication-title: Nat Biotechnol doi: 10.1038/nbt.1883 – volume: 14 start-page: 318 issue: 3 year: 2005 ident: 2022031506250617800_ref219 article-title: Rule-based workflow management for bioinformatics publication-title: VLDB J doi: 10.1007/s00778-005-0153-9 – volume: 117 start-page: 3224 issue: 10 year: 2020 ident: 2022031506250617800_ref144 article-title: Expanding the chinese hamster ovary cell long noncoding RNA transcriptome using RNASeq publication-title: Biotechnol Bioeng doi: 10.1002/bit.27467 – volume: 573 start-page: 149 issue: 7772 year: 2019 ident: 2022031506250617800_ref218 article-title: Workflow systems turn raw data into scientific knowledge publication-title: Nature doi: 10.1038/d41586-019-02619-z – volume: 8 issue: 9 year: 2019 ident: 2022031506250617800_ref56 article-title: rnaSPAdes: a de novo transcriptome assembler and its application to RNA-Seq data publication-title: Gigascience doi: 10.1093/gigascience/giz100 – volume: 8 start-page: 186 issue: 3 year: 1998 ident: 2022031506250617800_ref32 article-title: Base-calling of automated sequencer traces using phred. II. 
Error probabilities publication-title: Genome Res doi: 10.1101/gr.8.3.186 – volume: 49 start-page: D480 issue: D1 year: 2021 ident: 2022031506250617800_ref162 article-title: UniProt: the universal protein knowledgebase in 2021 publication-title: Nucleic Acids Res doi: 10.1093/nar/gkaa1100 – volume: 16 start-page: 71 issue: 2 year: 2015 ident: 2022031506250617800_ref4 article-title: RNA-mediated epigenetic regulation of gene expression publication-title: Nat Rev Genet doi: 10.1038/nrg3863 – volume: 20 start-page: 92 issue: 1 year: 2019 ident: 2022031506250617800_ref13 article-title: Next-generation genome annotation: we still struggle to get it right publication-title: Genome Biol doi: 10.1186/s13059-019-1715-2 – year: 2015 ident: 2022031506250617800_ref136 article-title: RNAseq by total RNA library identifies additional RNAs compared to poly(a) RNA library publication-title: Biomed Res Int doi: 10.1155/2015/862130 – volume: 11 start-page: 128 issue: 8 year: 2010 ident: 2022031506250617800_ref234 article-title: Schatz publication-title: The missing graphical user interface for genomics Genome Biol – volume: 19 start-page: 45 issue: 1 year: 2018 ident: 2022031506250617800_ref2 article-title: The emerging complexity of the tRNA world: mammalian tRNAs beyond protein synthesis publication-title: Nat Rev Mol Cell Biol doi: 10.1038/nrm.2017.77 – volume: 29 start-page: 2933 issue: 22 year: 2013 ident: 2022031506250617800_ref139 article-title: Infernal 1.1: 100-fold faster RNA homology searches publication-title: Bioinformatics doi: 10.1093/bioinformatics/btt509 – volume: 17 start-page: 13 year: 2016 ident: 2022031506250617800_ref91 article-title: A survey of best practices for RNA-seq data analysis publication-title: Genome Biol doi: 10.1186/s13059-016-0881-8 – volume: 21 start-page: 148 issue: 1 year: 2020 ident: 2022031506250617800_ref113 article-title: Compacta: a fast contig clustering tool for de novo assembled transcriptomes publication-title: BMC Genomics doi: 
10.1186/s12864-020-6528-x – volume: 36 start-page: 3420 issue: 10 year: 2008 ident: 2022031506250617800_ref186 article-title: High-throughput functional annotation and data mining with the Blast2GO suite publication-title: Nucleic Acids Res doi: 10.1093/nar/gkn176 – volume: 25 start-page: 1224 issue: 6 year: 2016 ident: 2022031506250617800_ref17 article-title: The power and promise of RNA-seq in ecology and evolution publication-title: Mol Ecol doi: 10.1111/mec.13526 – volume: 8 issue: 1 year: 2018 ident: 2022031506250617800_ref37 article-title: Evaluation of two main RNA-seq approaches for gene quantification in clinical RNA sequencing: polya+ selection versus rRNA depletion publication-title: Sci Rep – volume: 48 start-page: D762 issue: D1 year: 2020 ident: 2022031506250617800_ref166 article-title: WormBase: a modern model organism information resource publication-title: Nucleic Acids Res – volume: 15 start-page: 403 issue: 2 year: 2014 ident: 2022031506250617800_ref236 article-title: Dissemination of scientific software with galaxy ToolShed publication-title: Genome Biol doi: 10.1186/gb4161 – volume: 35 start-page: 1960 issue: 11 year: 2019 ident: 2022031506250617800_ref92 article-title: TPMCalculator: one-step software to quantify mRNA abundance of genomic features publication-title: Bioinformatics doi: 10.1093/bioinformatics/bty896 – volume: 12 start-page: 671 issue: 10 year: 2011 ident: 2022031506250617800_ref15 article-title: Next-generation transcriptome assembly publication-title: Nat Rev Genet doi: 10.1038/nrg3068 – volume: 21 issue: 1 year: 2020 ident: 2022031506250617800_ref255 article-title: Opportunities and challenges in long-read sequencing data analysis publication-title: Genome Biol doi: 10.1186/s13059-020-1935-5 – volume: 11 issue: 10 year: 2016 ident: 2022031506250617800_ref74 article-title: SeqKit: a cross-platform and ultrafast toolkit for FASTA/Q file manipulation publication-title: PLoS One doi: 10.1371/journal.pone.0163962 – volume: 21 
start-page: 153 issue: 1 year: 2020 ident: 2022031506250617800_ref145 article-title: Pan-tissue transcriptome analysis of long noncoding RNAs in the american beaver castor canadensis publication-title: BMC Genomics doi: 10.1186/s12864-019-6432-4 – volume-title: guigolab/FA-nf: 0.3.1 release year: 2021 ident: 2022031506250617800_ref206 – volume: 7 start-page: 9 issue: 1 year: 2020 ident: 2022031506250617800_ref20 article-title: De novo transcriptome assembly and annotation for gene discovery in avocado, macadamia and mango publication-title: Sci Data doi: 10.1038/s41597-019-0350-9 – volume: 21 start-page: 18 issue: 1 year: 2021 ident: 2022031506250617800_ref61 article-title: Error, noise and bias in de novo transcriptome assemblies publication-title: Mol Ecol Resour doi: 10.1111/1755-0998.13156 – volume: 46 start-page: D1190 issue: D1 year: 2018 ident: 2022031506250617800_ref167 article-title: PLAZA 4.0: an integrative resource for functional, evolutionary and comparative plant genomics publication-title: Nucleic Acids Res doi: 10.1093/nar/gkx1002 – volume-title: The gene ontology handbook year: 2016 ident: 2022031506250617800_ref182 – volume-title: The Linux Command Line: A Complete Introduction year: 2019 ident: 2022031506250617800_ref240 – volume: 1 year: 2017 ident: 2022031506250617800_ref126 article-title: Importing transcript abundance datasets with tximport publication-title: Dim Txi Inf Rep Sample1 – volume-title: Sequence - Evolution - Function: Computational Approaches in Comparative Genomics year: 2003 ident: 2022031506250617800_ref153 doi: 10.1007/978-1-4757-3783-7 – volume: 14 start-page: 103 issue: 2 year: 2007 ident: 2022031506250617800_ref52 article-title: Transcriptional noise and the fidelity of initiation by RNA polymerase II publication-title: Nat Struct Mol Biol doi: 10.1038/nsmb0207-103 – volume: 49 start-page: D373 year: 1 ident: 2022031506250617800_ref185 article-title: OMA orthology in 2021: website overhaul, conserved isoforms, ancestral 
gene order and more publication-title: Nucleic Acids Res doi: 10.1093/nar/gkaa1007 – volume: 14 start-page: 755 issue: 9 year: 1998 ident: 2022031506250617800_ref171 article-title: Profile hidden markov models publication-title: Bioinformatics doi: 10.1093/bioinformatics/14.9.755 – volume: 28 start-page: 2520 issue: 19 year: 2012 ident: 2022031506250617800_ref227 article-title: Snakemake–a scalable bioinformatics workflow engine publication-title: Bioinformatics doi: 10.1093/bioinformatics/bts480 – volume: 215 start-page: 403 issue: 3 year: 1990 ident: 2022031506250617800_ref158 article-title: Basic local alignment search tool publication-title: J Mol Biol doi: 10.1016/S0022-2836(05)80360-2 – ident: 2022031506250617800_ref228 – volume: 110 start-page: 1038 issue: 5 year: 2016 ident: 2022031506250617800_ref235 article-title: Models and simulations as a service: exploring the use of galaxy for delivering computational models publication-title: Biophys J doi: 10.1016/j.bpj.2015.12.041 – volume: 28 start-page: 1947 issue: 11 year: 2019 ident: 2022031506250617800_ref188 article-title: Toward understanding the origin and evolution of cellular organisms publication-title: Protein Sci doi: 10.1002/pro.3715 – volume: 9 start-page: 29 issue: Suppl 1 year: 2015 ident: 2022031506250617800_ref10 article-title: Advanced applications of RNA sequencing and challenges publication-title: Bioinform Biol Insights – volume: 21 start-page: 352 issue: Suppl 10 year: 2020 ident: 2022031506250617800_ref245 article-title: ELIXIR-IT HPC@CINECA: high performance computing resources for the bioinformatics community publication-title: BMC Bioinformatics doi: 10.1186/s12859-020-03565-8 – volume: 14 start-page: 1097 issue: 6 year: 2014 ident: 2022031506250617800_ref256 article-title: A first look at the oxford nanopore MinION sequencer publication-title: Mol Ecol Resour doi: 10.1111/1755-0998.12324 – volume: 84 start-page: 4355 issue: 13 year: 1987 ident: 2022031506250617800_ref170 article-title: 
Profile analysis: detection of distantly related proteins publication-title: Proc Natl Acad Sci U S A doi: 10.1073/pnas.84.13.4355 – start-page: 137 volume-title: RNA Bioinformatics year: 2015 ident: 2022031506250617800_ref38 doi: 10.1007/978-1-4939-2291-8_8 – volume: 38 issue: 12 year: 2010 ident: 2022031506250617800_ref53 article-title: Biases in illumina transcriptome sequencing caused by random hexamer priming publication-title: Nucleic Acids Res doi: 10.1093/nar/gkq224 – volume: 49 start-page: D412 issue: D1 year: 2021 ident: 2022031506250617800_ref177 article-title: Pfam: the protein families database in 2021 publication-title: Nucleic Acids Res doi: 10.1093/nar/gkaa913 – volume-title: RECOMB international workshop on comparative genomics year: 2017 ident: 2022031506250617800_ref213 – volume: 7 issue: 8 year: 2012 ident: 2022031506250617800_ref39 article-title: Selective depletion of rRNA enables whole transcriptome profiling of archival fixed tissue publication-title: PLoS One doi: 10.1371/journal.pone.0042882 – start-page: 227 volume-title: Gene Prediction year: 2019 ident: 2022031506250617800_ref77 doi: 10.1007/978-1-4939-9173-0_14 – volume: 18 start-page: 762 issue: 3 year: 2017 ident: 2022031506250617800_ref75 article-title: A tissue-mapped axolotl DE novo transcriptome enables identification of limb regeneration factors publication-title: Cell Rep doi: 10.1016/j.celrep.2016.12.063 – volume: 489 start-page: 57 issue: 7414 year: 2012 ident: 2022031506250617800_ref104 article-title: An integrated encyclopedia of DNA elements in the human genome publication-title: Nature doi: 10.1038/nature11247 – volume: 9 issue: 6 year: 2013 ident: 2022031506250617800_ref105 article-title: Pervasive transcription of the human genome produces thousands of previously unidentified long intergenic noncoding RNAs publication-title: PLoS Genet doi: 10.1371/journal.pgen.1003569 – volume: 18 start-page: 324 issue: 1 year: 2017 ident: 2022031506250617800_ref48 article-title: An 
improved filtering algorithm for big read datasets and its application to single-cell assembly publication-title: BMC Bioinformatics doi: 10.1186/s12859-017-1724-7 – volume: 15 issue: 8 year: 2020 ident: 2022031506250617800_ref73 article-title: De novo sequence assembly requires bioinformatic checking of chimeric sequences publication-title: PLoS One – volume: 49 start-page: D545 issue: D1 year: 2021 ident: 2022031506250617800_ref187 article-title: KEGG: integrating viruses and cellular organisms publication-title: Nucleic Acids Res doi: 10.1093/nar/gkaa970 – volume: 316 start-page: 1484 issue: 5830 year: 2007 ident: 2022031506250617800_ref142 article-title: RNA maps reveal new RNA classes and a possible function for pervasive transcription publication-title: Science doi: 10.1126/science.1138341 – volume: 25 start-page: 25 issue: 1 year: 2000 ident: 2022031506250617800_ref181 article-title: Gene ontology: tool for the unification of biology. The gene ontology consortium publication-title: Nat Genet doi: 10.1038/75556 – volume: 31 start-page: 2199 issue: 13 year: 2015 ident: 2022031506250617800_ref198 article-title: Annocript: a flexible pipeline for the annotation of transcriptomes able to identify putative long noncoding RNAs publication-title: Bioinformatics doi: 10.1093/bioinformatics/btv106 – volume: 43 start-page: e78 issue: 12 year: 2015 ident: 2022031506250617800_ref148 article-title: Identification of protein coding regions in RNA transcripts publication-title: Nucleic Acids Res doi: 10.1093/nar/gkv227 – volume: 48 start-page: D498 issue: D1 year: 2020 ident: 2022031506250617800_ref190 article-title: The reactome pathway knowledgebase publication-title: Nucleic Acids Res – volume: 35 start-page: 1026 issue: 11 year: 2017 ident: 2022031506250617800_ref108 article-title: MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets publication-title: Nat Biotechnol doi: 10.1038/nbt.3988 – volume: 18 start-page: S181 issue: Suppl 1 
year: 2002 ident: 2022031506250617800_ref69 article-title: Splicing graphs and EST assembly problem publication-title: Bioinformatics doi: 10.1093/bioinformatics/18.suppl_1.S181 – volume: 24 start-page: 1258 issue: 6 year: 2019 ident: 2022031506250617800_ref106 article-title: Alternative splicing, RNA-seq and drug discovery publication-title: Drug Discov Today doi: 10.1016/j.drudis.2019.03.030 – volume: 41 start-page: D590 issue: Database issue year: 2013 ident: 2022031506250617800_ref42 article-title: The SILVA ribosomal RNA gene database project: improved data processing and web-based tools publication-title: Nucleic Acids Res – volume: 409 start-page: 860 issue: 6822 year: 2001 ident: 2022031506250617800_ref76 article-title: Initial sequencing and analysis of the human genome publication-title: Nature doi: 10.1038/35057062 – volume: 20 start-page: 698 issue: Suppl 25 year: 2019 ident: 2022031506250617800_ref68 article-title: DTA-SiST: de novo transcriptome assembly by using simplified suffix trees publication-title: BMC Bioinformatics doi: 10.1186/s12859-019-3272-9 – volume: 9 issue: 1 year: 2018 ident: 2022031506250617800_ref109 article-title: Clustering huge protein sequence sets in linear time publication-title: Nat Commun doi: 10.1038/s41467-018-04964-5 |
SecondaryResourceType | review_article |
SourceID | pubmedcentral; proquest; pubmed; crossref; oup |
SourceType | Open Access Repository; Aggregation Database; Index Database; Enrichment Source; Publisher |
SubjectTerms | Annotations; Assembly; Gene sequencing; Genome; Genomes; High-Throughput Nucleotide Sequencing; Molecular Sequence Annotation; Proteins; Review; Ribonucleic acid; RNA; Sequence Analysis, RNA - methods; Transcriptome; Transcriptomes; Workflow |
Title | A simple guide to de novo transcriptome assembly and annotation |
URI | https://www.ncbi.nlm.nih.gov/pubmed/35076693 https://www.proquest.com/docview/2640678765 https://www.proquest.com/docview/2622660589 https://pubmed.ncbi.nlm.nih.gov/PMC8921630 |
Volume | 23 |
linkProvider | Oxford University Press |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+simple+guide+to+de+novo+transcriptome+assembly+and+annotation&rft.jtitle=Briefings+in+bioinformatics&rft.au=Raghavan%2C+Venket&rft.au=Kraft%2C+Louis&rft.au=Mesny%2C+Fantin&rft.au=Rigerte%2C+Linda&rft.date=2022-03-10&rft.pub=Oxford+University+Press&rft.issn=1467-5463&rft.eissn=1477-4054&rft.volume=23&rft.issue=2&rft_id=info:doi/10.1093%2Fbib%2Fbbab563&rft_id=info%3Apmid%2F35076693&rft.externalDocID=PMC8921630 |