A de novo metagenomic assembly program for shotgun DNA reads
Published in | Bioinformatics (Oxford, England) Vol. 28; no. 11; pp. 1455-1462 |
---|---|
Format | Journal Article |
Language | English |
Published | Oxford: Oxford University Press, 01.06.2012 |
Summary: | Motivation: A high-quality assembly of reads generated from shotgun sequencing is a substantial step in metagenome projects. Although traditional assemblers have been employed in initial analysis of metagenomes, they cannot surmount the challenges created by the features of metagenomic data.
Result: We present a de novo assembly approach and its implementation, MAP (metagenomic assembly program). Based on an improved overlap/layout/consensus (OLC) strategy combined with several special algorithms, MAP uses mate-pair information, which makes it better suited to the shotgun DNA reads (recommended length >200 bp) currently in wide use in metagenome projects. Extensive tests on simulated data show that MAP can outperform both Celera and Phrap on typical longer reads from Sanger sequencing, and that it has an evident advantage over Celera, Newbler and the newest Genovo on typical shorter reads from 454 sequencing.
Availability and implementation: The source code of MAP is distributed as open source under the GNU GPL license. The MAP program and all simulated datasets are freely available at http://bioinfo.ctb.pku.edu.cn/MAP/
Contact: hqzhu@pku.edu.cn
Supplementary information: Supplementary data are available at Bioinformatics online. |
---|---|
ISSN: | 1367-4803, 1367-4811 |
DOI: | 10.1093/bioinformatics/bts162 |