BulkVis: a graphical viewer for Oxford nanopore bulk FAST5 files

Bibliographic Details
Published in: Bioinformatics (Oxford, England) Vol. 35; no. 13; pp. 2193-2198
Main Authors: Payne, Alexander; Holmes, Nadine; Rakyan, Vardhman; Loose, Matthew
Format: Journal Article
Language: English
Published: England, Oxford University Press, 01.07.2019
Summary: The Oxford Nanopore Technologies (ONT) MinION is used for sequencing a wide variety of sample types with diverse methods of sample extraction. Nanopore sequencers output FAST5 files containing signal data that are subsequently base called to FASTQ format. Optionally, ONT devices can collect data from all sequencing channels simultaneously in a bulk FAST5 file, enabling inspection of the signal in any channel at any point. We sought to visualize this signal to inspect challenging or difficult-to-sequence samples. The BulkVis tool loads a bulk FAST5 file, overlays MinKNOW (the software that controls ONT sequencers) classifications on the signal trace, and can show mappings to a reference. Users can navigate to a channel and time or, given a FASTQ header from a read, jump to its specific position. BulkVis can export regions as Nanopore base caller compatible reads. Using BulkVis, we find that long reads can be incorrectly divided by MinKNOW, resulting in single DNA molecules being split into two or more reads. The longest such molecule seen to date is 2 272 580 bases in length and was reported in eleven consecutive reads. We provide helper scripts that identify and reconstruct split reads given a sequencing summary file and an alignment to a reference. We note that incorrect read splitting appears to vary according to input sample type and is more common in 'ultra-long' read preparations. The software is freely available under an MIT license at https://github.com/LooseLab/bulkvis. Supplementary data are available at Bioinformatics online.
ISSN: 1367-4803, 1367-4811
DOI: 10.1093/bioinformatics/bty841
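
Illustrative note: the summary mentions helper scripts that identify split reads from a sequencing summary file and an alignment. The sketch below is not the authors' implementation. It shows only the timing half of that idea: flagging consecutive reads from the same channel that are separated by a very short gap, which makes them candidates for having come from one molecule. The column names (`read_id`, `channel`, `start_time`, `duration`), the function name `find_split_candidates`, and the one-second `max_gap` threshold are assumptions chosen for illustration, and the alignment check described in the summary is omitted.

```python
import csv
import io
from collections import defaultdict


def find_split_candidates(summary_file, max_gap=1.0):
    """Return (channel, [read_ids]) chains of consecutive reads on one
    channel whose start follows the previous read's end within max_gap
    seconds. Column names and threshold are illustrative assumptions."""
    by_channel = defaultdict(list)
    for row in csv.DictReader(summary_file, delimiter="\t"):
        start = float(row["start_time"])
        end = start + float(row["duration"])
        by_channel[int(row["channel"])].append((start, end, row["read_id"]))

    chains = []
    for channel in sorted(by_channel):
        reads = sorted(by_channel[channel])  # order reads by start time
        chain = [reads[0][2]]
        prev_end = reads[0][1]
        for start, end, read_id in reads[1:]:
            if start - prev_end <= max_gap:
                chain.append(read_id)  # short gap: extend the current chain
            else:
                if len(chain) > 1:
                    chains.append((channel, chain))
                chain = [read_id]  # long gap: start a new chain
            prev_end = end
        if len(chain) > 1:
            chains.append((channel, chain))
    return chains


# Tiny made-up table for demonstration; not real sequencing data.
EXAMPLE = (
    "read_id\tchannel\tstart_time\tduration\n"
    "a\t1\t0.0\t10.0\n"
    "b\t1\t10.2\t5.0\n"
    "c\t1\t40.0\t3.0\n"
    "d\t2\t1.0\t2.0\n"
)

if __name__ == "__main__":
    print(find_split_candidates(io.StringIO(EXAMPLE)))
```

Timing alone cannot confirm that two reads came from the same molecule. In the workflow the summary describes, candidates would also need to map adjacently on a reference before being treated as a single split read.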