A fast deduplication scheme for stored data in distributed storage systems
Main Authors | , |
---|---|
Format | Conference Proceeding |
Language | English |
Published | SPIE, 31.05.2023 |
Summary: | Data deduplication can effectively reduce data redundancy. However, its write performance is insufficient for existing storage systems because of the additional computation and I/O operations it requires. To improve deduplication speed in a distributed storage system, we propose FastDedup, a fast and effective deduplication scheme that focuses on stored data. FastDedup improves deduplication speed through a deduplication task distribution model and multi-container pool technology. Specifically, the task distribution model maintains correctness when multiple deduplication nodes work simultaneously, and the multi-container pool technology reduces operation time in the data merging stage. Evaluation results on three real backup datasets demonstrate that, compared with the unimproved technique, FastDedup increases deduplication throughput by 3.2% to 69.1%. |
---|---|
Bibliography: | Conference Date: 2023-02-17 to 2023-02-19; Conference Location: Hangzhou, China |
ISBN: | 9781510666290, 151066629X |
ISSN: | 0277-786X |
DOI: | 10.1117/12.2680561 |
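
The Summary describes its two mechanisms only at a high level: distributing deduplication tasks so that several nodes can work at once without breaking correctness, and keeping a pool of containers so that less time is spent merging data. The sketch below is one possible illustration of those two ideas, not the FastDedup implementation, whose details this record does not give. It assumes that chunks are identified by SHA-256 fingerprints, that a chunk is routed to a node by its fingerprint so identical chunks always reach the same node, and that each node appends unique chunks to its own container. All function names (`fingerprint`, `route`, `deduplicate`) are hypothetical.

```python
import hashlib


def fingerprint(chunk: bytes) -> str:
    """Return a content fingerprint for a chunk (SHA-256 is an assumption)."""
    return hashlib.sha256(chunk).hexdigest()


def route(fp: str, num_nodes: int) -> int:
    """Map a fingerprint to a node id.

    Identical chunks share a fingerprint, so they always land on the same
    node. Each node can therefore check for duplicates against its own index
    only, without coordinating with the other nodes.
    """
    return int(fp[:8], 16) % num_nodes


def deduplicate(chunks, num_nodes):
    """Deduplicate already-stored chunks across `num_nodes` simulated nodes.

    Returns the file recipe (ordered fingerprints) and one container per
    node holding that node's unique chunks. Keeping the containers separate
    stands in for the "multi-container pool" idea: nothing is merged into a
    single shared container afterwards.
    """
    indexes = [set() for _ in range(num_nodes)]      # per-node fingerprint index
    containers = [[] for _ in range(num_nodes)]      # per-node chunk container
    recipe = []
    for chunk in chunks:
        fp = fingerprint(chunk)
        node = route(fp, num_nodes)
        if fp not in indexes[node]:                  # first time this node sees it
            indexes[node].add(fp)
            containers[node].append(chunk)
        recipe.append(fp)                            # recipe keeps original order
    return recipe, containers
```

Calling `deduplicate([b"a", b"b", b"a", b"c", b"b"], 3)` returns a five-entry recipe and containers that together hold three unique chunks. In a real system the nodes would run concurrently and the containers would live on disk; the single loop here only shows why fingerprint-based routing keeps the per-node duplicate checks independent.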