LRTK: A unified and versatile toolkit for analyzing linked-read sequencing data

Bibliographic Details
Published in bioRxiv
Main Authors Yang, Chao; Zhang, Zhenmiao; Liao, Herui; Zhang, Lu
Format Paper
Language English
Published Cold Spring Harbor: Cold Spring Harbor Laboratory Press, 13.08.2022
Summary: Linked-read sequencing technologies, which offer reads with both high base quality and long-range DNA connectedness, have shown great success in genomic studies. The mainstream platforms include 10x Genomics linked-read (10x), Single Tube Long Fragment Read (stLFR) and Transposase Enzyme-Linked Long-read Sequencing (TELL-Seq). Existing data analysis pipelines, e.g., Long Ranger, were developed to process sequencing data from particular platforms and so cannot fully utilize the unique characteristics of other platforms; as a result, users have few tools to choose from for downstream analysis. To address these limitations, we present Linked-Read ToolKit (LRTK), a unified and versatile toolkit for processing linked-read sequencing data from different platforms. LRTK provides flexible functions for data simulation, format conversion, data preprocessing, barcode-aware read alignment, variant calling and phasing. It also supports multi-sample batch processing and generates an HTML report with key statistics and plots. We applied LRTK to linked-read data of NA24385 obtained from all three platforms; the results showed that LRTK improved the structural variation recall rate for 10x linked-reads and increased the phase block N50 for 10x and stLFR linked-reads.
Availability: Source code is available at https://github.com/ericcombiolab/LRTK. LRTK and its dependencies can be installed with Anaconda.
Contact: ericluzhang@hkbu.edu.hk
Supplementary information: Supplementary data are available at Bioinformatics online.
Competing Interest Statement: The authors have declared no competing interest.
DOI:10.1101/2022.08.10.503458
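
Note on phase block N50: the summary reports phase block N50 as an evaluation metric without defining it. The sketch below shows the commonly used definition of N50 applied to a list of phase block lengths: the largest length L such that blocks of length L or longer account for at least half of the total phased length. It is an illustration only and is not taken from the LRTK source code; the function name `n50` and the example lengths are hypothetical, and some tools use the genome length rather than the total block length as the denominator.

```python
def n50(lengths):
    """Return the N50 of a collection of block lengths (0 if empty)."""
    # Sort blocks from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    # Assumption: half of the summed block length is the threshold.
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first block that pushes the running sum to the threshold
        # defines the N50.
        if running >= half_total:
            return length
    return 0


# Hypothetical phase block lengths in base pairs.
example_blocks = [40_000, 30_000, 20_000, 10_000]
example_n50 = n50(example_blocks)
```

A larger N50 means that more of the phased sequence lies in long contiguous blocks, which is why the metric is used to compare phasing results across platforms.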