A dynamic block device reconfiguration algorithm in virtual MapReduce cluster

With the advances of cloud computing and virtualization technologies, running MapReduce applications over clouds has been attracting more and more attention in recent years. However, as a fundamental problem, the performance of MapReduce applications can sometimes be severely degraded due to the ove...

Full description

Saved in:
Bibliographic Details
Published inCluster computing Vol. 17; no. 4; pp. 1171 - 1183
Main Authors Lee, Kwonyong, Nam, Yoonsung, Kim, Taekhee, Park, Sungyong, Lee, Hyuk-Jun, Yang, Jihoon
Format Journal Article
LanguageEnglish
Published Boston Springer US 01.12.2014
Springer Nature B.V
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:With the advances of cloud computing and virtualization technologies, running MapReduce applications over clouds has been attracting more and more attention in recent years. However, as a fundamental problem, the performance of MapReduce applications can sometimes be severely degraded due to the overheads from I/O virtualization and resource competitions among virtual machines. In this paper, we propose a dynamic block device reconfiguration algorithm in virtual MapReduce clusters, which reduces the data transfer time between virtual machines and thereby improving the performance of MapReduce applications on top of the clouds. The proposed algorithm utilizes a block device reconfiguration scheme, where a block device attached to a virtual machine can be dynamically detached and reattached to other virtual machines at runtime. This scheme allows us to move files easily across different virtual machines without any network transfers between virtual machines. This algorithm is also dynamic in a sense that it estimates the total data transfer times between virtual machines using multiple regression analysis based on CPU utilization and data size, and adaptively determines a least-cost data transfer path between a mapper virtual machine and a reducer virtual machine. We have implemented our algorithm in Hadoop MapReduce. The benchmarking results showed that the overheads incurred by transferring data from mapper virtual machines to reducer virtual machines are minimized and the execution times of MapReduce applications are shortened up to 14 %.
ISSN:1386-7857
1573-7543
DOI:10.1007/s10586-014-0375-y