proovframe: frameshift-correction for long-read (meta)genomics
Published in | bioRxiv |
---|---|
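Main Authors | Hackl, Thomas; Trigodet, Florian; Eren, A Murat; Biller, Steven J; Eppley, John M; Luo, Elaine; Burger, Andrew; Delong, Edward F; Fischer, Matthias G |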
Format | Paper |
Language | English |
Published | Cold Spring Harbor: Cold Spring Harbor Laboratory Press; Cold Spring Harbor Laboratory, 24.08.2021 |
Edition | 1.1 |
Subjects | Bioinformatics; Genomic analysis; Proteins |
Online Access | https://www.biorxiv.org/content/10.1101/2021.08.23.457338 |
Abstract | Long-read sequencing technologies hold great promise for the genomic analysis of complex samples such as microbial communities. Yet, despite improving accuracy, basic gene prediction on long-read data is still often impaired by frameshifts resulting from small indels. Consensus polishing using either complementary short reads or, to a lesser extent, the long reads themselves can mitigate this effect but requires universally high sequencing depth, which is difficult to achieve in complex samples where the majority of community members are rare. Here we present proovframe, a software implementing an alternative approach to overcome frameshift errors in long-read assemblies and raw long reads. We utilize protein-to-nucleotide alignments against reference databases to pinpoint indels in contigs or reads and correct them by deleting or inserting 1-2 bases, thereby conservatively restoring reading-frame fidelity in aligned regions. Using simulated and real-world benchmark data, we show that proovframe performs comparably to short-read-based polishing on assembled data, works well with remote protein homologs, and can even be applied to raw reads directly. Together, our results demonstrate that protein-guided frameshift correction significantly improves the analyzability of long-read data, both in combination with and as an alternative to common polishing strategies. Proovframe is available from https://github.com/thackl/proovframe. Competing Interest Statement: The authors have declared no competing interest. Footnotes: https://github.com/thackl/proovframe; http://github.com/thackl/proovframe-benchmark; https://doi.org/10.5281/zenodo.5164669 |
---|---|
Author | Hackl, Thomas; Trigodet, Florian; Eren, A Murat; Biller, Steven J; Eppley, John M; Luo, Elaine; Burger, Andrew; Delong, Edward F; Fischer, Matthias G |
ContentType | Paper |
Copyright | 2021. This article is published under http://creativecommons.org/licenses/by-nc/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. 2021, Posted by Cold Spring Harbor Laboratory |
DOI | 10.1101/2021.08.23.457338 |
Discipline | Biology |
EISSN | 2692-8205 |
ExternalDocumentID | 2021.08.23.457338v1 |
Genre | Working Paper/Pre-Print |
ISSN | 2692-8205 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
License | This pre-print is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), CC BY-NC 4.0, as described at http://creativecommons.org/licenses/by-nc/4.0 |
Notes | Competing Interest Statement: The authors have declared no competing interest. |
ORCID | 0000-0001-5334-1704 0000-0001-9013-4827 0000-0003-0238-7571 0000-0002-3088-4965 0000-0002-4933-2896 0000-0002-0022-320X 0000-0002-2638-823X 0000-0002-4014-3626 |
OpenAccessLink | https://www.biorxiv.org/content/10.1101/2021.08.23.457338 |
PageCount | 15 |
PublicationDate | 2021-08-24 |
PublicationPlace | Cold Spring Harbor |
PublicationTitle | bioRxiv |
PublicationYear | 2021 |
Publisher | Cold Spring Harbor Laboratory Press; Cold Spring Harbor Laboratory |
SecondaryResourceType | preprint |
SourceType | Open Access Repository; Aggregation Database |
SubjectTerms | Bioinformatics; Genomic analysis; Proteins |
Title | proovframe: frameshift-correction for long-read (meta)genomics |
URI | https://www.proquest.com/docview/2563946434 https://www.biorxiv.org/content/10.1101/2021.08.23.457338 |
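
The correction step summarized in the abstract (deleting or inserting 1-2 bases at positions pinpointed by protein-to-nucleotide alignment) can be illustrated with a minimal Python sketch. This is not proovframe's implementation: the function name `apply_frameshift_edits`, the `(position, kind, length)` edit tuples, and the `n` filler base are hypothetical, and the edit positions are assumed to have already been derived from an alignment.

```python
from typing import List, Tuple

# Hypothetical edit record (not proovframe's format): (position, kind, length)
#   position: 0-based index into the nucleotide sequence
#   kind:     "del" removes `length` surplus bases starting at position,
#             "ins" inserts `length` filler bases before position
Edit = Tuple[int, str, int]


def apply_frameshift_edits(seq: str, edits: List[Edit], filler: str = "n") -> str:
    """Return a copy of seq with 1-2 base deletions/insertions applied."""
    bases = list(seq)
    # Apply edits from right to left so that coordinates of the
    # remaining (more upstream) edits are not shifted by earlier ones.
    for pos, kind, length in sorted(edits, key=lambda e: e[0], reverse=True):
        if length not in (1, 2):
            raise ValueError("frameshift edits are limited to 1-2 bases")
        if kind == "del":
            del bases[pos:pos + length]
        elif kind == "ins":
            bases[pos:pos] = filler * length
        else:
            raise ValueError(f"unknown edit kind: {kind!r}")
    return "".join(bases)


# A read with one surplus base (index 6) and one missing base (before index 12).
corrected = apply_frameshift_edits(
    "ATGGCCCAAATTGGGTAA", [(6, "del", 1), (12, "ins", 1)]
)
print(corrected)
```

Inserting an ambiguous filler rather than guessing the missing nucleotide is one way to keep such a correction conservative: the reading frame is restored without asserting a base that was never sequenced.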