Outputting map-reduce jobs to an archive file
Format | Patent |
---|---|
Language | English |
Published | 16.03.2016 |
Summary: | Disclosed is a method of outputting map-reduce jobs to an archive file. The method includes providing an archive manager and exposing an interface that map-reduce jobs call to output to an archive file in a map-reduce distributed file system, and using a buffering database as a temporary cache to buffer updates to the archive file. The archive manager's handling of calls from map-reduce jobs allows reading directly from an archive file or from a job index at the buffering database, writing to a job index at the buffering database, and outputting updates from a job index to the archive file. The call handling may include receiving a read call for a task of a map-reduce job, connecting to the buffering database, looking up a unique token for the job in a pending index and a committed index provided by the database, and, depending on the status of the job, reading either from the archive file or from a job index. |
---|---|
Bibliography: | Application Number: GB20140016018 |
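
The call handling described in the summary can be illustrated with a minimal Python sketch. It is not the patented implementation. The class and method names (`BufferingDatabase`, `ArchiveManager`, `read`, `write`, `commit`, `flush`) are hypothetical, plain dictionaries and sets stand in for the archive file and the buffering database, and the rule used here for choosing the read source is an assumption: a job whose token appears in the pending or committed index reads from its own job index first, and any other job reads from the archive.

```python
from dataclasses import dataclass, field


@dataclass
class BufferingDatabase:
    """In-memory stand-in for the buffering database (hypothetical)."""
    pending: set = field(default_factory=set)        # tokens of jobs still writing
    committed: set = field(default_factory=set)      # tokens of jobs awaiting output
    job_indexes: dict = field(default_factory=dict)  # token -> buffered updates


class ArchiveManager:
    """Handles read and write calls from map-reduce tasks (illustrative only)."""

    def __init__(self, archive, db):
        self.archive = archive  # dict standing in for the archive file
        self.db = db

    def write(self, token, key, value):
        # Writes never touch the archive directly; they are buffered
        # in the job index for this job's unique token.
        self.db.pending.add(token)
        self.db.job_indexes.setdefault(token, {})[key] = value

    def commit(self, token):
        # Assumed status transition: pending -> committed.
        self.db.pending.discard(token)
        self.db.committed.add(token)

    def read(self, token, key):
        # Look the token up in the pending and committed indexes.
        if token in self.db.pending or token in self.db.committed:
            job_index = self.db.job_indexes.get(token, {})
            if key in job_index:
                return job_index[key]
        # Assumed fallback: unknown jobs, or keys the job has not
        # updated, are served from the archive.
        return self.archive.get(key)

    def flush(self, token):
        # Output the buffered updates of a committed job to the archive.
        if token not in self.db.committed:
            raise ValueError(f"job {token!r} is not committed")
        self.archive.update(self.db.job_indexes.pop(token, {}))
        self.db.committed.discard(token)


archive = {"part-0": "old"}
manager = ArchiveManager(archive, BufferingDatabase())
manager.write("job-1", "part-0", "new")
print(manager.read("job-1", "part-0"))  # served from the job index
print(manager.read("job-2", "part-0"))  # served from the archive
manager.commit("job-1")
manager.flush("job-1")
print(archive["part-0"])                # update now in the archive
```

In this sketch other jobs keep seeing the archived value until the writing job's updates are flushed; whether the disclosed method offers that isolation is not stated in the summary.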