Outputting map-reduce jobs to an archive file

Bibliographic Details
Main Authors	NIALL MCCARROLL, CURTIS NORMAN BROWNING
Format	Patent
Language	English
Published	16.03.2016
Subjects	CALCULATING, COMPUTING, COUNTING; ELECTRIC DIGITAL DATA PROCESSING; PHYSICS
Summary:	Disclosed is a method of outputting map-reduce jobs to an archive file. The method includes providing an archive manager and exposing an interface to be called from map-reduce jobs to output to an archive file in a map-reduce distributed file system, and using a buffering database as a temporary cache to buffer updates to the archive file. The archive manager's handling of calls from map-reduce jobs allows reading directly from an archive file or from a job index at the buffering database, writing to a job index at the buffering database, and outputting updates from a job index to the archive file. The call handling may include receiving a read call for a task of a map-reduce job, connecting to the buffering database, looking up a unique token for the job in a pending index and a committed index provided by the database, and, depending on the status of the job, reading either from the archive file or from a job index.
Bibliography:	Application Number: GB20140016018
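
The call handling described in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the class and method names (BufferingDatabase, ArchiveManager, write, read, commit) are hypothetical, an in-memory object stands in for the buffering database, and a ZIP file stands in for the archive file. The summary does not say which job status selects which read source, so the sketch assumes that a pending job reads its own buffered entries from its job index and that every other read goes to the archive file.

```python
import io
import zipfile


class BufferingDatabase:
    """In-memory stand-in for the buffering database (hypothetical)."""

    def __init__(self):
        self.pending = set()     # tokens of jobs with buffered, unflushed updates
        self.committed = set()   # tokens of jobs whose updates reached the archive
        self.job_indexes = {}    # token -> {entry name: bytes}


class ArchiveManager:
    """Interface that map-reduce tasks would call instead of touching the archive."""

    def __init__(self, archive, db):
        self.archive = archive   # path or seekable file-like object holding a ZIP
        self.db = db

    def write(self, token, name, data):
        # Buffer the update in the job index; the archive file is not modified.
        self.db.pending.add(token)
        self.db.job_indexes.setdefault(token, {})[name] = data

    def read(self, token, name):
        # Look the job token up in the pending index, then pick the read source.
        # Assumption: pending jobs see their own buffered entries first.
        index = self.db.job_indexes.get(token, {})
        if token in self.db.pending and name in index:
            return index[name]
        with zipfile.ZipFile(self.archive, "r") as zf:
            return zf.read(name)

    def commit(self, token):
        # Output the buffered updates to the archive, then move the token
        # from the pending index to the committed index.
        updates = self.db.job_indexes.pop(token, {})
        with zipfile.ZipFile(self.archive, "a") as zf:
            for name, data in updates.items():
                zf.writestr(name, data)
        self.db.pending.discard(token)
        self.db.committed.add(token)


# Usage: one archived entry, one job that buffers a new entry and commits it.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("part-0", b"archived")

manager = ArchiveManager(buf, BufferingDatabase())
manager.write("job-1", "part-1", b"buffered")
before = manager.read("job-1", "part-1")   # served from the job index
manager.commit("job-1")
after = manager.read("job-1", "part-1")    # served from the archive file
```

Buffering writes in a per-job index means concurrent tasks never append to the archive directly; only the commit step modifies it, which is the role the summary assigns to the temporary cache.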