Similarity based deduplication for secondary storage

Bibliographic Details
Main Authors	Kishi, Gregory T; Dain, Joseph W
Format	Patent
Language	English
Published	08.10.2019
Subjects	CALCULATING; COMPUTING; COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS

Summary:	For similarity based deduplication of remote data repositories, a parse module generates a rolling hash value based on a portion of an incoming stream of backup data. A comparison module compares the rolling hash value with entries stored in a rolling hash index, and in response to matching the rolling hash value with an entry in the rolling hash index, generates a strong hash value and determines if a match of the strong hash value exists in a first strong hash index. The comparison module, in response to a determination that the match does not exist in the first strong hash index, compares the strong hash value with entries in a second strong hash index in the remote data repository. A migration module, in response to a determination that the strong hash value does not match any hash entries, stores the portion of backup data as new data.
Bibliography:	Application Number: US201615084322
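
The lookup order described in the summary can be sketched roughly as follows. This is only an illustrative outline, not the patented implementation: the class and function names, the fixed 8-byte window, the polynomial hash, the use of SHA-256 as the strong hash, the in-memory sets standing in for the indexes, and the treatment of a rolling hash miss as new data are all assumptions made for the example.

```python
import hashlib

WINDOW = 8               # assumed: bytes of the chunk covered by the weak hash
BASE = 257               # assumed polynomial base
MOD = (1 << 61) - 1      # assumed modulus


def rolling_hash(chunk: bytes) -> int:
    """Weak polynomial hash over the first WINDOW bytes (stand-in for a true rolling hash)."""
    h = 0
    for b in chunk[:WINDOW]:
        h = (h * BASE + b) % MOD
    return h


def strong_hash(chunk: bytes) -> str:
    """Strong hash over the whole chunk (SHA-256 is an assumption)."""
    return hashlib.sha256(chunk).hexdigest()


class Deduplicator:
    """Hypothetical two-tier lookup: rolling index, then local and remote strong indexes."""

    def __init__(self, remote_strong_index, rolling_index=()):
        self.rolling_index = set(rolling_index)          # rolling hash index
        self.local_strong_index = set()                  # "first" strong hash index
        self.remote_strong_index = remote_strong_index   # "second" index, remote repository
        self.stored = []                                 # chunks kept as new data

    def ingest(self, chunk: bytes) -> str:
        weak = rolling_hash(chunk)
        # The strong hash is only computed when the cheap rolling hash matches.
        if weak in self.rolling_index:
            strong = strong_hash(chunk)
            if strong in self.local_strong_index:
                return "duplicate-local"
            if strong in self.remote_strong_index:
                return "duplicate-remote"
        # No match anywhere: store the chunk as new data and index it (assumed bookkeeping).
        self.rolling_index.add(weak)
        self.local_strong_index.add(strong_hash(chunk))
        self.stored.append(chunk)
        return "new"


# Usage: the remote repository already holds one chunk.
remote_chunk = b"remote chunk"
dedup = Deduplicator(
    remote_strong_index={strong_hash(remote_chunk)},
    rolling_index={rolling_hash(remote_chunk)},
)
results = [
    dedup.ingest(b"hello world"),   # first sight of this chunk
    dedup.ingest(b"hello world"),   # repeat of the same chunk
    dedup.ingest(b"hello wow!"),    # same first 8 bytes, different content
    dedup.ingest(remote_chunk),     # known only to the remote repository
]
```

In this sketch the third chunk collides with the first on the weak hash but fails the strong hash comparison in both indexes, so it is stored as new data, which is the reason for the second, stronger check.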