Association rule based frequent pattern mining in biological sequences

Bibliographic Details
Published in	2013 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC), pp. 1-5
Main Authors	Salim, A., Chandra, S. S. Vinod
Format	Conference Proceeding
Language	English
Published	IEEE 01.12.2013
Subjects	Algorithm design and analysis; Apriori; Bioinformatics; Generators; Genomic Sequences; Genomics; Most Frequent Pattern; Pattern matching

Summary:	Finding all frequent patterns present in a set of strings is computationally intensive. An exhaustive search, in which every possible candidate is considered, is impractical for larger pattern widths because of its exponential computational complexity. Other approaches apply heuristics, in which the algorithm tries to reduce the search space but may compromise the accuracy of the results to a certain extent. We used a modified Apriori algorithm to mine possible patterns in a very long sequence, in particular the most frequent substring pattern of a fixed length in a biological sequence. The algorithm performs well through rapid reduction of the search space and by using bit-wise operations instead of expensive string comparisons. It outperforms existing pattern-finding methods such as MEME in terms of execution time.
ISBN:	1479915947 9781479915941
DOI:	10.1109/ICCIC.2013.6724203
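
Illustrative sketch (not taken from the paper): the summary mentions Apriori-style pruning and bit-wise operations but does not give the algorithm itself. The Python code below is one plausible reading, under stated assumptions: the input is an uppercase DNA string containing only A, C, G and T; each base is packed into 2 bits; and a substring of length n is counted only if both its length n-1 prefix and suffix were already frequent. The names encode_kmers, frequent_kmers, decode and min_support are hypothetical and chosen for this example only.

```python
from collections import Counter

# Assumed 2-bit encoding of DNA bases (an illustration, not the paper's scheme).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode_kmers(seq, k):
    """Yield every length-k window of seq as a 2k-bit integer.

    The window is updated with a shift, an OR and a mask, so no
    substring is ever sliced or compared as text.
    """
    mask = (1 << (2 * k)) - 1
    value = 0
    for i, base in enumerate(seq):
        value = ((value << 2) | CODE[base]) & mask
        if i >= k - 1:
            yield value


def frequent_kmers(seq, k, min_support):
    """Return {encoded k-mer: count} for k-mers seen at least min_support times.

    Apriori-style pruning: a window of length `level` is counted only if
    its prefix and suffix of length `level - 1` were frequent at the
    previous level.
    """
    frequent = None  # frequent patterns of the previous level
    counts = {}
    for level in range(1, k + 1):
        suffix_mask = (1 << (2 * (level - 1))) - 1
        tally = Counter()
        for code in encode_kmers(seq, level):
            if frequent is not None:
                prefix = code >> 2            # drop the last base
                suffix = code & suffix_mask   # drop the first base
                if prefix not in frequent or suffix not in frequent:
                    continue
            tally[code] += 1
        counts = {c: n for c, n in tally.items() if n >= min_support}
        frequent = set(counts)
        if not frequent:
            break  # nothing longer can be frequent
    return counts


def decode(code, k):
    """Turn a 2k-bit integer back into its DNA string."""
    return "".join("ACGT"[(code >> (2 * (k - 1 - i))) & 3] for i in range(k))
```

For example, frequent_kmers("ACGTACGTACGA", 3, 2) keeps ACG (3 occurrences) and CGT, GTA and TAC (2 each), while CGA is never counted because its suffix GA occurs only once. This sketch rescans the sequence at every level for simplicity; how the published method organizes its passes is not stated in this record.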