Similarity based deduplication for secondary storage

Bibliographic Details
Main Authors: Kishi, Gregory T; Dain, Joseph W
Format: Patent
Language: English
Published: 08.10.2019
Summary: For similarity-based deduplication of remote data repositories, a parse module generates a rolling hash value from a portion of an incoming stream of backup data. A comparison module compares the rolling hash value with entries stored in a rolling hash index. In response to matching the rolling hash value with an entry in that index, it generates a strong hash value and determines whether a match for the strong hash value exists in a first strong hash index. If no match exists in the first strong hash index, the comparison module compares the strong hash value with entries in a second strong hash index in the remote data repository. A migration module, in response to a determination that the strong hash value does not match any hash entries, stores the portion of backup data as new data.
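
The lookup order described in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation. The class and function names are hypothetical, `zlib.adler32` stands in for the rolling hash (a real rolling hash would be updated incrementally as the window slides), SHA-256 stands in for the strong hash, and the second strong hash index is modeled as a local set rather than a lookup against the remote data repository. Two further assumptions go beyond the summary: a portion whose rolling hash has no match is treated as new data, and new data is recorded in the rolling hash index and the first strong hash index.

```python
import hashlib
import zlib
from dataclasses import dataclass, field


def weak_hash(portion: bytes) -> int:
    # Stand-in for the rolling hash (assumption): Adler-32 computed in one shot.
    return zlib.adler32(portion)


def strong_hash(portion: bytes) -> str:
    # Stand-in for the strong hash (assumption): SHA-256 hex digest.
    return hashlib.sha256(portion).hexdigest()


@dataclass
class Deduplicator:
    rolling_index: set = field(default_factory=set)
    first_strong_index: set = field(default_factory=set)
    # Models the index held in the remote data repository (assumption: local set).
    second_strong_index: set = field(default_factory=set)
    stored: list = field(default_factory=list)

    def ingest(self, portion: bytes) -> str:
        """Return 'first', 'second', or 'new' for one portion of backup data."""
        weak = weak_hash(portion)
        strong = None
        if weak in self.rolling_index:
            # The strong hash is only generated after a rolling hash match.
            strong = strong_hash(portion)
            if strong in self.first_strong_index:
                return "first"
            if strong in self.second_strong_index:
                return "second"
        # No strong hash match anywhere: store the portion as new data.
        if strong is None:
            strong = strong_hash(portion)
        self.rolling_index.add(weak)
        self.first_strong_index.add(strong)
        self.stored.append(portion)
        return "new"
```

Calling `ingest` twice with the same bytes returns `"new"` and then `"first"`. A portion whose hashes were pre-loaded into `rolling_index` and `second_strong_index` returns `"second"` and is not stored again.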
Bibliography: Application Number: US201615084322