Evaluation of haplotype-aware long-read error correction with hifieval

Bibliographic Details
Published in Bioinformatics (Oxford, England) Vol. 39; no. 10
Main Authors Guo, Yujie; Feng, Xiaowen; Li, Heng
Format Journal Article
Language English
Published England Oxford University Press 03.10.2023

Summary: The PacBio High-Fidelity (HiFi) sequencing technology produces long reads with >99% accuracy. It has enabled the development of a new generation of de novo sequence assemblers, all of which have sequencing error correction (EC) as the first step. As HiFi is a new data type, this critical step has not been evaluated before. Here, we introduced hifieval, a new command-line tool for measuring over- and under-corrections produced by EC algorithms. We assessed the accuracy of the EC components of existing HiFi assemblers on the CHM13 and HG002 datasets and further investigated the performance of EC methods in challenging regions such as homopolymer regions, centromeric regions, and segmental duplications. Hifieval will help HiFi assemblers improve EC and assembly quality in the long run.
Availability and implementation: The source code is available at https://github.com/magspho/hifieval.
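
To make the idea of over- and under-correction concrete, here is a minimal Python sketch. It is not hifieval's implementation or interface. It assumes, for illustration only, that the errors in a read are already known as sets of reference positions before and after EC, and that an error remaining after EC counts as an under-correction while an error introduced by EC counts as an over-correction. The names `ECCounts` and `classify_corrections` are hypothetical.

```python
from typing import NamedTuple, Set


class ECCounts(NamedTuple):
    """Hypothetical per-read tally; not a hifieval data structure."""
    corrected: int        # errors present before EC and gone after
    under_corrected: int  # errors present before EC and still present after
    over_corrected: int   # errors absent before EC but present after


def classify_corrections(raw_errors: Set[int], corrected_errors: Set[int]) -> ECCounts:
    """Compare error positions of one read before and after error correction.

    Both arguments are assumed to be reference coordinates of mismatching
    bases, obtained by aligning the raw and the corrected read to a reference.
    """
    return ECCounts(
        corrected=len(raw_errors - corrected_errors),
        under_corrected=len(raw_errors & corrected_errors),
        over_corrected=len(corrected_errors - raw_errors),
    )


# Made-up example: three errors in the raw read, two in the corrected read.
counts = classify_corrections({10, 57, 203}, {57, 140})
print(counts)
```

In this made-up example, the errors at positions 10 and 203 were fixed, the one at 57 survived correction, and the one at 140 was newly introduced.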
ISSN: 1367-4811, 1367-4803
DOI: 10.1093/bioinformatics/btad631