Performance analysis of parallel de novo genome assembly in shared memory system
Published in | IOP Conference Series: Earth and Environmental Science, Vol. 187, no. 1, pp. 12032-12040 |
---|---|
Main Authors | , , |
Format | Journal Article |
Language | English |
Published | Bristol, IOP Publishing, 19.11.2018 |
Summary: | De novo genome assembly is a computationally intensive task in genome analysis that builds the whole genome from small fragments (reads) generated by next-generation sequencing (NGS) platforms. Parallel processing is one way to reduce the running time. In this work, we analyze the performance of three popular de novo genome assembly tools based on the de Bruijn graph, namely Velvet, SOAPdenovo2, and ABySS, in a parallel environment. Simulated and real genome datasets from several species are used in this study. We assess performance using two criteria: the quality of the contigs produced and the parallel performance. For contig quality, we measure the N50 size, the number of contigs, and the maximum contig length. For parallel performance, we measure the speedup obtained from using a multi-core CPU in a shared memory system. Lastly, the memory usage of each tool is also compared. Based on the experiments, SOAPdenovo2 has the best contig quality, with the highest N50 value. All assembly tools work well in the parallel environment and achieve significant speedup. SOAPdenovo2 performs best, reaching a super-linear speedup of 22 times. As for memory usage, ABySS is the most efficient. |
---|---|
ISSN: | 1755-1307, 1755-1315 |
DOI: | 10.1088/1755-1315/187/1/012032 |
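
The evaluation criteria named in the summary (N50 size, number of contigs, maximum contig length, and speedup) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the contig lengths, and the timings below are all hypothetical and chosen only for illustration. N50 is computed here under its common definition, the length of the contig at which the cumulative sum of lengths, taken from longest to shortest, first reaches half of the total assembly length. Speedup is taken as serial time divided by parallel time, and a speedup larger than the number of cores counts as super-linear.

```python
from typing import Dict, Sequence


def n50(lengths: Sequence[int]) -> int:
    """Return the N50 of a set of contig lengths (common definition)."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for positive lengths


def contig_stats(lengths: Sequence[int]) -> Dict[str, int]:
    """Collect the three contig quality measures named in the summary."""
    return {
        "n50": n50(lengths),
        "num_contigs": len(lengths),
        "max_length": max(lengths),
    }


def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup = running time on one core / running time on p cores."""
    return t_serial / t_parallel


def is_super_linear(t_serial: float, t_parallel: float, cores: int) -> bool:
    """True when the speedup exceeds the number of cores used."""
    return speedup(t_serial, t_parallel) > cores


if __name__ == "__main__":
    # Hypothetical contig lengths (bp), not data from the paper.
    print(contig_stats([80, 70, 50, 40, 30, 20]))
    # Hypothetical timings (seconds) and core count, not data from the paper.
    print(speedup(220.0, 10.0), is_super_linear(220.0, 10.0, cores=16))
```

With real assemblies, the lengths would come from the contig FASTA file each tool writes, and the timings from wall-clock measurements at each core count.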