Network topology optimization for data aggregation with splitting

Bibliographic Details
Published in IEEE International Symposium on Signal Processing and Information Technology, pp. 000398-000403
Main Authors Das, Soham; Sahni, Sartaj
Format Conference Proceeding
Language English
Published IEEE 01.12.2014
Summary: In this paper, we develop algorithms for the data aggregation problem that arises in big-data applications employing the MapReduce operation. For the case in which source racks can send their data to the aggregator over multiple paths, we show that an aggregation tree topology that minimizes aggregation time can be constructed in polynomial time. We also consider the problem of constructing aggregation trees that minimize total network traffic subject to the primary constraint that aggregation time is minimized, and we present heuristics for this problem. Experiments show that allowing multiple paths reduces aggregation time by up to 99% relative to aggregation trees constructed using the LPT rule [3]. This reduction in aggregation time, however, comes with an increase of up to 35% in total network traffic when racks have more than 2 optical links.
ISSN: 2162-7843
DOI: 10.1109/ISSPIT.2014.7300622