A Novel Real-Time Genome Comparison Method Using Discrete Wavelet Transform

Bibliographic Details
Published in Journal of computational biology Vol. 25; no. 4; p. 405
Main Authors Huang, Hsin-Hsiung, Girimurugan, Senthil B
Format Journal Article
Language English
Published United States 01.04.2018

Summary: Real-time genome comparison is important for identifying unknown species and clustering organisms. We propose a novel method that represents genome sequences of different lengths as 12-dimensional numerical vectors in real time for this purpose. Given a genome sequence, a binary indicator sequence marking the locations of each nucleotide base is computed, and the discrete wavelet transform is then applied to these four binary indicator sequences to obtain their respective power spectra. The moments of each power spectrum are then calculated, and the 12-dimensional numerical vector is constructed from the first three moments of the four spectra. Our experimental results on various data sets show that the proposed method is efficient and effective for clustering genes and genomes. It runs significantly faster than other alignment-free and alignment-based methods.
ISSN: 1557-8666
DOI: 10.1089/cmb.2017.0115
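
The pipeline outlined in the summary (indicator sequences, wavelet transform, power spectra, moments) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not state the wavelet, the decomposition depth, or the exact moment formula, so the sketch assumes an orthonormal Haar wavelet, zero-padding to a power of two, a power spectrum formed from squared coefficients, and a j-th moment defined as the mean of the j-th powers. All function names are hypothetical.

```python
from typing import Dict, List

BASES = "ACGT"  # fixed base order, so the output vector layout is A, C, G, T


def indicator_sequences(seq: str) -> Dict[str, List[float]]:
    """One 0/1 sequence per base: 1.0 where that base occurs, else 0.0."""
    seq = seq.upper()
    return {b: [1.0 if ch == b else 0.0 for ch in seq] for b in BASES}


def pad_to_power_of_two(x: List[float]) -> List[float]:
    """Zero-pad so the Haar transform can halve the signal repeatedly (assumption)."""
    n = 1
    while n < len(x):
        n *= 2
    return x + [0.0] * (n - len(x))


def haar_dwt(x: List[float]) -> List[float]:
    """Full orthonormal Haar decomposition (assumed wavelet, not from the source).

    Returns the final approximation coefficient followed by the detail
    coefficients, coarsest level first.
    """
    approx = pad_to_power_of_two(x)
    details: List[float] = []
    s = 2 ** 0.5
    while len(approx) > 1:
        next_approx, level_detail = [], []
        for i in range(0, len(approx), 2):
            a, b = approx[i], approx[i + 1]
            next_approx.append((a + b) / s)
            level_detail.append((a - b) / s)
        details = level_detail + details  # coarser levels go in front
        approx = next_approx
    return approx + details


def power_spectrum(coeffs: List[float]) -> List[float]:
    """Squared wavelet coefficients (assumed definition of the power spectrum)."""
    return [c * c for c in coeffs]


def moments(power: List[float], orders=(1, 2, 3)) -> List[float]:
    """Assumed moment definition: mean of the j-th powers of the spectrum."""
    n = len(power)
    return [sum(p ** j for p in power) / n for j in orders]


def feature_vector(seq: str) -> List[float]:
    """4 bases x 3 moments = 12 numbers, whatever the sequence length."""
    vec: List[float] = []
    indicators = indicator_sequences(seq)
    for base in BASES:
        vec.extend(moments(power_spectrum(haar_dwt(indicators[base]))))
    return vec


if __name__ == "__main__":
    print(feature_vector("ACGTTGCAAC"))
```

Because every sequence maps to a vector of the same length, sequences of different lengths can be compared with an ordinary distance (for example, Euclidean) and passed to a standard clustering routine. The choice of distance is likewise an assumption here, not something the record specifies.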