Parallelization of searching and mining time series data using Dynamic Time Warping

Bibliographic Details
Published in	2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI) pp. 343 - 348
Main Authors	Shabib, Ahmed; Narang, Anish; Niddodi, Chaitra Prasad; Das, Madhura; Pradeep, Rachita; Shenoy, Varun; Auradkar, Prafullata; Vignesh, T. S.; Sitaram, Dinkar
Format	Conference Proceeding
Language	English
Published	IEEE 01.08.2015
Subjects	Algorithm design and analysis; Clustering algorithms; Dynamic time warping; Instruction sets; Multicore; Multicore processing; Random access memory; Spark; Sparks; Time series; Time series analysis

More Information
Summary:	Among the various algorithms available for data mining, the UCR Dynamic Time Warping (DTW) suite provided a way to search and mine large time series data sets more efficiently than the previously existing method based on Euclidean distance. The UCR DTW algorithm was developed for a single CPU core. In this paper, we consider two methods of parallelizing the DTW algorithm: first a multi-core implementation, then a cluster implementation using Spark. The multi-core implementation achieves nearly linear speedup. In Spark, we find that a straightforward implementation of DTW does not perform well, because a major step in DTW is the parallel computation of a lower bound. This paradigm is not well supported by Spark, which supports (i) broadcast variables, which are broadcasts of read-only variables, and (ii) accumulation variables, which represent distributed sums. We show how to compute distributed lower bounds efficiently in Spark and achieve nearly linear speedup with DTW in a Spark computation as well.
ISBN:	9781479987900 1479987905
DOI:	10.1109/ICACCI.2015.7275633
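
Illustration:	The summary describes lower-bound pruning in a DTW search and the difficulty of sharing a bound across workers when only read-only broadcasts and reductions are available. The sketch below is one way to picture that idea in plain Python, and it is not the paper's implementation. It assumes equal-length series, a squared-error cost, a Sakoe-Chiba band, and LB_Keogh as the lower bound (the summary does not say which bound is used). The names `dtw`, `lb_keogh`, `search_partition`, and `nearest_neighbor` are hypothetical, and the partitions are simulated with lists rather than a Spark cluster: in each round every partition reads the same bound, and the per-partition results are merged with a minimum.

```python
import math


def dtw(a, b, window):
    """DTW distance (squared-error cost) inside a Sakoe-Chiba band."""
    n, m = len(a), len(b)
    w = max(window, abs(n - m))
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [math.inf] * (m + 1)
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            cur[j] = cost + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return prev[m]


def lb_keogh(query, cand, window):
    """LB_Keogh lower bound on dtw(query, cand, window); equal lengths assumed."""
    total = 0.0
    n = len(query)
    for i, c in enumerate(cand):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        upper, lower = max(query[lo:hi]), min(query[lo:hi])
        if c > upper:
            total += (c - upper) ** 2
        elif c < lower:
            total += (lower - c) ** 2
    return total


def search_partition(query, items, window, bound):
    """Scan one partition, skipping candidates whose lower bound cannot beat `bound`."""
    best = (bound, None)
    for idx, series in items:
        if lb_keogh(query, series, window) >= best[0]:
            continue  # pruned: full DTW cannot be smaller than the bound
        d = dtw(query, series, window)
        if d < best[0]:
            best = (d, idx)
    return best


def nearest_neighbor(query, data, window, n_parts=2, batch=2):
    """Simulated broadcast/reduce rounds over `n_parts` partitions (no Spark involved)."""
    indexed = list(enumerate(data))
    parts = [indexed[p::n_parts] for p in range(n_parts)]
    best = (math.inf, None)
    longest = max((len(p) for p in parts), default=0)
    for start in range(0, longest, batch):
        # "Broadcast": every partition reads the same bound for this round.
        results = [
            search_partition(query, p[start:start + batch], window, best[0])
            for p in parts
        ]
        # "Reduce": keep the smallest distance found so far.
        best = min([best] + results, key=lambda r: r[0])
    return best


query = [0, 1, 2, 1, 0]
data = [[5, 5, 5, 5, 5], [0, 1, 2, 1, 0.5], [0, 0, 1, 2, 1], [3, 3, 3, 3, 3]]
distance, index = nearest_neighbor(query, data, window=1)
```

A smaller `batch` refreshes the shared bound more often, which prunes more candidates but costs more rounds of coordination; that trade-off is an assumption of this sketch rather than a result reported in the record above.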