DNA Sequence Splicing Algorithm Based on Spark

Bibliographic Details
Published in	2016 International Conference on Industrial Informatics - Computing Technology, Intelligent Technology, Industrial Information Integration (ICIICII) pp. 52 - 56
Main Authors	Xu Pan, Xue-Liang Fu, Gai-Fang Dong, Hong-Hui Li
Format	Conference Proceeding
Language	English
Published	IEEE 01.12.2016
Subjects	Algorithm design and analysis; Bioinformatics; Computers; DNA; DNA sequence splicing; Hard disks; parallel algorithm; Parallel algorithms; Spark; Sparks; Splicing

Summary:	Bioinformatics is an interdisciplinary field concerned with processing biological information, and DNA sequence splicing is one of its research topics. At present, most parallel algorithms run on MapReduce, whose repeated reading from and writing to hard disk slows the algorithm down. This paper proposes using Spark's in-memory computing model to solve the problem, together with a new K-2 bit matching method. Experimental results show that running on Spark with this method preserves the accuracy of the splicing results and makes the algorithm more efficient.
DOI:	10.1109/ICIICII.2016.0024