Fulcrum: condensing redundant reads from high-throughput sequencing studies
Published in | Bioinformatics (Oxford, England), Vol. 28, no. 10, pp. 1324-1327 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Oxford: Oxford University Press, 15.05.2012 |
Subjects | |
Summary: | Motivation: Ultra-high-throughput sequencing produces duplicate and near-duplicate reads, which can consume computational resources in downstream applications. A tool that collapses such reads should reduce storage and assembly complications and costs.
Results: We developed Fulcrum to collapse identical and near-identical Illumina and 454 reads (such as those from PCR clones) into single error-corrected sequences; it can process paired-end as well as single-end reads. Fulcrum is customizable and can be deployed on a single machine, a local network or a commercially available MapReduce cluster, and it has been optimized to maximize ease of use, cross-platform compatibility and future scalability. Sequence datasets have been collapsed by up to 71%, and the reduced number and improved quality of the resulting sequences allow assemblers to produce longer contigs while using less memory.
Availability and implementation: Source code and a tutorial are available at http://pringlelab.stanford.edu/protocols.html under a BSD-like license. Fulcrum was written and tested in Python 2.6, and the single-machine and local-network modes depend on a modified version of the Parallel Python library (provided).
Contact: erik.m.lehnert@gmail.com
Supplementary information: Supplementary information is available at Bioinformatics online. |
---|---|
Bibliography: | Associate Editor: Alex Bateman |
ISSN: | 1367-4803, 1367-4811 |
DOI: | 10.1093/bioinformatics/bts123 |