SP-TSRM: A Data Grouping Strategy in Distributed Storage System
Published in | Algorithms and Architectures for Parallel Processing Vol. 11334; pp. 524 - 531 |
---|---|
Format | Book Chapter |
Language | English |
Published | Switzerland, Springer International Publishing AG, 2018 |
Series | Lecture Notes in Computer Science |
Summary: | With the development of smart devices and social media, massive amounts of unstructured data are uploaded to distributed storage systems. Because accesses to unstructured data involve many users and high concurrency, they pose new challenges to traditional distributed storage systems designed for large files. We propose a grouping strategy that identifies data accessed together, based on disk access logs from a real distributed storage environment. When any data in a group is accessed, the whole group is prefetched from disk to the cache. First, we conduct a statistical analysis of the access logs and propose a preliminary method to classify files by spatiotemporal locality. Second, a strength-priority tree structure relation model (SP-TSRM) is proposed to mine file groups efficiently. Finally, experiments show that the proposed model can improve the cache hit rate significantly, thereby improving the read efficiency of distributed storage systems. |
---|---|
ISBN: | 3030050505 9783030050504 |
ISSN: | 0302-9743 1611-3349 |
DOI: | 10.1007/978-3-030-05051-1_36 |
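
The group prefetching idea described in the summary can be illustrated with a minimal sketch. This is not the SP-TSRM model from the chapter: the file groups are assumed to be given (mined elsewhere), the cache is a plain LRU, and prefetching is assumed to be triggered on a cache miss. The names `GroupPrefetchCache` and `hit_rate` are hypothetical and used only for illustration.

```python
from collections import OrderedDict


class GroupPrefetchCache:
    """LRU cache that loads a file's whole group on a miss (illustrative only)."""

    def __init__(self, capacity, groups):
        self.capacity = capacity
        self.cache = OrderedDict()  # file id -> True, ordered oldest to newest
        # Map each file to its group; groups are assumed to be mined elsewhere.
        self.group_of = {f: tuple(g) for g in groups for f in g}
        self.hits = 0
        self.misses = 0

    def _insert(self, file_id):
        self.cache[file_id] = True
        self.cache.move_to_end(file_id)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used entry

    def access(self, file_id):
        if file_id in self.cache:
            self.hits += 1
            self.cache.move_to_end(file_id)
            return True
        self.misses += 1
        # Miss: bring in the other group members first, then the requested
        # file, so the requested file ends up as the most recently used entry.
        for member in self.group_of.get(file_id, (file_id,)):
            if member != file_id:
                self._insert(member)
        self._insert(file_id)
        return False


def hit_rate(trace, capacity, groups):
    """Replay an access trace and return the fraction of cache hits."""
    cache = GroupPrefetchCache(capacity, groups)
    for f in trace:
        cache.access(f)
    return cache.hits / len(trace)


trace = ["a", "b", "c", "a", "b", "c"]
print(hit_rate(trace, 4, []))                 # no grouping: 0.5
print(hit_rate(trace, 4, [["a", "b", "c"]]))  # a, b, c grouped: 5 hits out of 6
```

In this toy trace, grouping turns the first accesses to `b` and `c` into hits because both were loaded when `a` missed. How much a real workload benefits depends on how well the mined groups match actual access patterns and on the cache size.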