Similarity based deduplication for secondary storage
Main Authors | , |
---|---|
Format | Patent |
Language | English |
Published | 08.10.2019 |
Summary: | For similarity based deduplication of remote data repositories, a parse module generates a rolling hash value based on a portion of an incoming stream of backup data. A comparison module compares the rolling hash value with entries stored in a rolling hash index, and in response to matching the rolling hash value with an entry in the rolling hash index, generates a strong hash value and determines if a match of the strong hash value exists in a first strong hash index. The comparison module, in response to a determination that the match does not exist in the first strong hash index, compares the strong hash value with entries in a second strong hash index in the remote data repository. A migration module, in response to a determination that the strong hash value does not match any hash entries, stores the portion of backup data as new data. |
---|---|
Bibliography: | Application Number: US201615084322 |
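
The lookup order described in the summary (rolling hash index first, then a first strong hash index, then a second strong hash index in the remote repository, and finally storing the portion as new data) can be illustrated with a minimal Python sketch. This is not the patented implementation. Everything named here is an assumption made for illustration: the `Deduplicator` class, the use of Adler-32 as the rolling hash and SHA-256 as the strong hash, the in-memory sets standing in for the indexes, and the choice to treat a rolling hash miss as new data (the summary does not say what happens in that case).

```python
import hashlib
import zlib


def rolling_hash(chunk: bytes) -> int:
    # Assumption: Adler-32 stands in for the rolling hash. It is computed
    # over the whole chunk here rather than updated incrementally.
    return zlib.adler32(chunk)


def strong_hash(chunk: bytes) -> str:
    # Assumption: SHA-256 stands in for the strong hash.
    return hashlib.sha256(chunk).hexdigest()


class Deduplicator:
    """Hypothetical stand-in for the parse, comparison and migration modules."""

    def __init__(self, remote_chunks=()):
        self.rolling_index = set()        # rolling hash index
        self.local_strong_index = set()   # first strong hash index
        self.remote_strong_index = set()  # second strong hash index (remote)
        self.stored = []                  # portions kept as new data
        # Assumption: the rolling hash index also covers remote data,
        # so that a remote duplicate can reach the strong hash checks.
        for chunk in remote_chunks:
            self.rolling_index.add(rolling_hash(chunk))
            self.remote_strong_index.add(strong_hash(chunk))

    def process(self, chunk: bytes) -> str:
        weak = rolling_hash(chunk)
        strong = None
        if weak in self.rolling_index:
            # Rolling hash matched: confirm with the strong hash.
            strong = strong_hash(chunk)
            if strong in self.local_strong_index:
                return "duplicate-local"
            if strong in self.remote_strong_index:
                return "duplicate-remote"
        # No strong hash match anywhere: store the portion as new data.
        if strong is None:
            strong = strong_hash(chunk)
        self.rolling_index.add(weak)
        self.local_strong_index.add(strong)
        self.stored.append(chunk)
        return "new"


dedup = Deduplicator(remote_chunks=[b"remote block"])
results = [
    dedup.process(b"alpha"),
    dedup.process(b"alpha"),
    dedup.process(b"remote block"),
]
print(results)
```

In this sketch the cheap rolling hash acts as a filter, so the more expensive strong hash comparison and the remote lookup only happen for portions that already look similar to known data.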