De-duplication of data in a data processing system

Bibliographic Details
Main Authors Farber, David A, Lachman, Ronald D
Format Patent
Language English
Published 24.05.2011
Summary: In a data processing system, a method includes deleting a particular copy of a data item when at least one other copy of the data item is available. The presence of another copy of the data item is determined, at least in part, based on an identifier for the data item, the identifier having been computed using all of the data in the data item and only the data in the data item, wherein two identical data items in the data processing system will have identical identifiers. The particular copy of the data item may be deleted if another copy of the data is determined to be present on another processor in the system or on the same processor. The identifier of the data item is computed using a function such as a message digest or hash function which may be: MD4, MD5, or SHA.
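
The idea in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `content_id` and `delete_if_duplicated`, the in-memory `store` dictionary standing in for copies held on one or more processors, and the choice of SHA-1 as the default digest are all assumptions made for illustration. The identifier is derived from the bytes of the data item alone, so identical items receive identical identifiers, and a copy is deleted only when another copy with the same identifier is found.

```python
import hashlib
from typing import Dict


def content_id(data: bytes, algorithm: str = "sha1") -> str:
    """Return an identifier computed from all of the data and only the data.

    `algorithm` is any digest name accepted by hashlib (for example "md5"
    or "sha1"); the default here is an assumption, not taken from the source.
    """
    return hashlib.new(algorithm, data).hexdigest()


def delete_if_duplicated(store: Dict[str, bytes], location: str) -> bool:
    """Delete the copy at `location` if another copy of the same item exists.

    `store` is a hypothetical mapping from a location label (such as
    "processor:path") to the bytes held there. Returns True if the copy
    was deleted, False if it was the only copy and was kept.
    """
    target = content_id(store[location])
    has_other_copy = any(
        other != location and content_id(data) == target
        for other, data in store.items()
    )
    if has_other_copy:
        del store[location]
    return has_other_copy
```

For example, with `store = {"a:/x": b"hello", "b:/y": b"hello"}`, calling `delete_if_duplicated(store, "a:/x")` removes the first copy, and a second call on `"b:/y"` leaves the remaining copy in place because no other copy exists.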