LRTK: A unified and versatile toolkit for analyzing linked-read sequencing data
Linked-read sequencing technologies offering reads with both high base quality and long-range DNA connectedness have shown great success in genomic studies. The mainstream platforms include 10x Genomics linked-read (10x), Single Tube Long Fragment Read (stLFR) and Transposase Enzyme-Linked Long-read...
Published in | bioRxiv |
---|---|
Main Authors | Yang, Chao; Zhang, Zhenmiao; Liao, Herui; Zhang, Lu |
Format | Paper |
Language | English |
Published | Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 13.08.2022 |
Subjects | Batch processing; Bioinformatics; Data processing; Genomics; Statistical analysis; Transposase |
Abstract | Linked-read sequencing technologies, which offer reads with both high base quality and long-range DNA connectedness, have shown great success in genomic studies. The mainstream platforms include 10x Genomics linked-read (10x), Single Tube Long Fragment Read (stLFR) and Transposase Enzyme-Linked Long-read Sequencing (TELL-Seq). Existing data analysis pipelines, e.g., Long Ranger, were developed to process sequencing data from particular platforms and so cannot fully utilize the unique characteristics of other platforms; as a result, users have few tools to choose from for downstream analysis. To address these limitations, we present Linked-Read ToolKit (LRTK), a unified and versatile toolkit to process linked-read sequencing data from different platforms. LRTK provides flexible functions for data simulation, format conversion, data preprocessing, barcode-aware read alignment, variant calling and phasing. It also supports multi-sample batch processing and generates an HTML report with key statistics and plots. We applied LRTK to the linked-read data of NA24385 obtained from all three platforms; the results showed that LRTK improved the structural variation recall rate for 10x linked-reads and increased the phase block N50 for 10x and stLFR linked-reads. Availability: Source code is available at https://github.com/ericcombiolab/LRTK. Anaconda supports the installation of LRTK and its dependencies. Contact: ericluzhang@hkbu.edu.hk. Supplementary information: Supplementary data are available at Bioinformatics online. Competing Interest Statement: The authors have declared no competing interest. |
---|---|
Author | Liao, Herui; Zhang, Lu; Zhang, Zhenmiao; Yang, Chao |
Author_xml | 1. Yang, Chao; 2. Zhang, Zhenmiao; 3. Liao, Herui; 4. Zhang, Lu |
ContentType | Paper |
Copyright | 2022. This article is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
DOI | 10.1101/2022.08.10.503458 |
Genre | Working Paper/Pre-Print |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
PublicationCentury | 2000 |
PublicationDate | 20220813 |
PublicationDateYYYYMMDD | 2022-08-13 |
PublicationDate_xml | – month: 08 year: 2022 text: 20220813 day: 13 |
PublicationDecade | 2020 |
PublicationPlace | Cold Spring Harbor |
PublicationPlace_xml | – name: Cold Spring Harbor |
PublicationTitle | bioRxiv |
PublicationYear | 2022 |
Publisher | Cold Spring Harbor Laboratory Press |
Publisher_xml | – name: Cold Spring Harbor Laboratory Press |
SubjectTerms | Batch processing; Bioinformatics; Data processing; Genomics; Statistical analysis; Transposase |
Title | LRTK: A unified and versatile toolkit for analyzing linked-read sequencing data |
URI | https://www.proquest.com/docview/2701611630 |