Outputting map-reduce jobs to an archive file

Bibliographic Details
Main Authors NIALL MCCARROLL, CURTIS NORMAN BROWNING
Format Patent
Language English
Published 16.03.2016
Summary: Disclosed is a method of outputting map-reduce jobs to an archive file. The method includes providing an archive manager and exposing an interface to be called from map-reduce jobs to output to an archive file in a map-reduce distributed file system, using a buffering database as a temporary cache to buffer updates to the archive file. The archive manager's handling of calls from map-reduce jobs allows reading directly from an archive file or from a job index at the buffering database, writing to a job index at the buffering database, and outputting updates from a job index to the archive file. The call handling may include receiving a read call for a task of a map-reduce job, connecting to the buffering database, looking up a unique token for the job in a pending index and a committed index provided by the database, and, depending on the status of the job, either reading from the archive file or reading from a job index.
Bibliography: Application Number: GB20140016018
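
The call handling described in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation: the class and method names (BufferingDatabase, ArchiveManager, begin, write, commit, read, flush) are hypothetical, and in-memory dictionaries stand in for both the buffering database and the archive file. The summary does not say which job status selects which read path, so the sketch assumes that a job whose token appears in the pending or committed index reads its own buffered entries first and falls back to the archive, while any other job reads the archive directly.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class BufferingDatabase:
    """In-memory stand-in for the buffering database (hypothetical layout)."""
    pending: Set[str] = field(default_factory=set)    # tokens of jobs still writing
    committed: Set[str] = field(default_factory=set)  # tokens of jobs awaiting output
    job_indexes: Dict[str, Dict[str, bytes]] = field(default_factory=dict)


@dataclass
class ArchiveManager:
    """Hypothetical interface that map-reduce tasks would call."""
    db: BufferingDatabase
    archive: Dict[str, bytes] = field(default_factory=dict)  # stand-in for the archive file

    def begin(self, token: str) -> None:
        # Register the job's unique token in the pending index.
        self.db.pending.add(token)
        self.db.job_indexes.setdefault(token, {})

    def write(self, token: str, entry: str, data: bytes) -> None:
        # Writes are buffered in the job index, never applied to the archive directly.
        if token not in self.db.pending:
            raise KeyError(f"job {token!r} is not pending")
        self.db.job_indexes[token][entry] = data

    def commit(self, token: str) -> None:
        # Move the token from the pending index to the committed index.
        self.db.pending.remove(token)
        self.db.committed.add(token)

    def read(self, token: str, entry: str) -> Optional[bytes]:
        # Assumption: a known job sees its own buffered updates first.
        if token in self.db.pending or token in self.db.committed:
            buffered = self.db.job_indexes.get(token, {})
            if entry in buffered:
                return buffered[entry]
        # Otherwise read directly from the archive.
        return self.archive.get(entry)

    def flush(self) -> None:
        # Output the updates of committed jobs from their job indexes to the archive.
        for token in sorted(self.db.committed):
            self.archive.update(self.db.job_indexes.pop(token, {}))
        self.db.committed.clear()
```

In this sketch a task would call begin once per job, write for each output entry, and commit when the job finishes; flush is the point at which buffered updates reach the archive, so other jobs see the old archive contents until then.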