Network topology optimization for data aggregation with splitting

Bibliographic Details
Published in	IEEE International Symposium on Signal Processing and Information Technology, pp. 398-403
Main Authors	Das, Soham; Sahni, Sartaj
Format	Conference Proceeding
Language	English
Published	IEEE 01.12.2014
Subjects	Approximation algorithms; Big Data applications; Clustering algorithms; Complexity theory; Data Center Networks; Map-Reduce tasks; Network topology; Optical fiber communication; Optical switches; Software Defined networking; Topology
Summary:	In this paper, we develop algorithms for the data aggregation problem that arises in big-data applications employing the MapReduce operation. When source racks can send their data to the aggregator over multiple paths, we show that an aggregation tree topology that minimizes aggregation time can be constructed in polynomial time. We also consider the problem of constructing aggregation trees that minimize total network traffic subject to the primary constraint that aggregation time is minimized, and we present heuristics for this problem. Experiments show that allowing multiple paths reduces aggregation time by up to 99% relative to aggregation trees constructed using the LPT rule [3]. This reduction in aggregation time, however, comes with an increase of up to 35% in total network traffic when racks have more than 2 optical links.
ISSN:	2162-7843
DOI:	10.1109/ISSPIT.2014.7300622
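
Note on the LPT rule: the summary uses trees built with the LPT rule [3] as its baseline. In the scheduling literature, LPT (longest processing time first) is a greedy heuristic that sorts items by decreasing size and gives each one to the currently least-loaded bin. This record does not say how [3] maps the rule onto aggregation trees, so the sketch below shows only the generic heuristic. The names `lpt_schedule`, `sizes`, and `num_bins` are hypothetical, and reading `sizes` as per-rack data volumes is an assumption made for illustration.

```python
import heapq


def lpt_schedule(sizes, num_bins):
    """Generic LPT heuristic (illustrative only, not the method of [3]).

    sizes    -- list of item sizes (assumed here: data volume per source rack)
    num_bins -- number of bins sharing the load (assumed >= 1)
    Returns (assignment, makespan): assignment[b] lists the item indices
    placed in bin b, and makespan is the largest resulting bin load.
    """
    # Min-heap of (current load, bin index) so the lightest bin pops first.
    heap = [(0, b) for b in range(num_bins)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_bins)]

    # Visit items from largest to smallest.
    order = sorted(range(len(sizes)), key=lambda i: sizes[i], reverse=True)
    for idx in order:
        load, b = heapq.heappop(heap)
        assignment[b].append(idx)
        heapq.heappush(heap, (load + sizes[idx], b))

    makespan = max(load for load, _ in heap)
    return assignment, makespan


if __name__ == "__main__":
    assignment, makespan = lpt_schedule([7, 5, 4, 3, 1], 2)
    print(assignment, makespan)
```

LPT balances load but places every item in exactly one bin. The summary's multi-path setting, by contrast, lets a source rack send its data over several paths, which is the difference the reported experiments measure.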