Segmentation of Time Series Based on Kinetic Characteristics for Storage Consumption Prediction

Bibliographic Details
Published in: 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), pp. 2559-2560
Main Authors: Beibei Miao, Yu Chen, Xuebo Jin, Bo Wang, Xianping Qu, Shimin Tao, Dong Wang, Zhi Zang
Format: Conference Proceeding
Language: English
Published: IEEE, 01.06.2017
Summary: Internet services generate huge amounts of data, which require large amounts of storage space. Determining a device purchase plan is therefore very important for service providers: under-purchasing might lead to data loss, while over-purchasing would result in waste. In this paper, we propose a linear-regression-based approach to predict storage demand from the time series of storage consumption. We partition the storage consumption time series into several linear segments and perform prediction on the last segment using linear regression. Since the positions of the turning points between adjacent segments and the total number of segments are both unknown, achieving online segmentation is a major challenge. To solve this problem, we developed the Kalman-Anova segmentation method. Experimental results show that our method performs well in terms of precision, recall, and F-measure. Moreover, the method can also segment nonlinear time series, suggesting potentially wider application. The proposed method has been deployed at Baidu Inc. and has saved about 45 thousand dollars in one of its device purchase programs.
ISSN: 1063-6927, 2575-8411
DOI: 10.1109/ICDCS.2017.254
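
The prediction step described in the summary (fit a line to the last segment, then extrapolate) can be illustrated with a minimal sketch. This is not the authors' Kalman-Anova method: the index of the last turning point is assumed to be already known, and the function names `fit_line` and `forecast_from_last_segment`, as well as the sample data, are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def fit_line(ys: Sequence[float], start: int = 0) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * t + intercept,
    where the time index t runs over start, start + 1, ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    ts = range(start, start + n)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean


def forecast_from_last_segment(
    series: Sequence[float], last_turning_point: int, horizon: int
) -> List[float]:
    """Fit a line to series[last_turning_point:] and extrapolate it
    `horizon` steps past the end of the series.

    Assumption: last_turning_point is supplied by some segmentation
    method (in the paper, Kalman-Anova); it is not computed here.
    """
    segment = series[last_turning_point:]
    slope, intercept = fit_line(segment, start=last_turning_point)
    n = len(series)
    return [slope * t + intercept for t in range(n, n + horizon)]


if __name__ == "__main__":
    # Made-up storage consumption: growth of 1 unit per step, then 3 per step.
    usage = [10, 11, 12, 13, 16, 19, 22, 25]
    # Index 3 is the (assumed known) turning point where the slope changes.
    print(forecast_from_last_segment(usage, last_turning_point=3, horizon=2))
```

Fitting only the last segment means the forecast follows the most recent growth rate; fitting the whole series would average the older, slower trend into the slope and underestimate demand in this example.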