Alleviation of Disk I/O Contention in Virtualized Settings for Data-Intensive Computing

Bibliographic Details
Published in: 2015 IEEE/ACM 2nd International Symposium on Big Data Computing (BDC), pp. 1-10
Main Authors: Malensek, Matthew; Pallickara, Sangmi Lee; Pallickara, Shrideep
Format: Conference Proceeding
Language: English
Published: ACM 01.12.2015
Summary: Steady growth in storage and processing capabilities has led to the accumulation of large-scale datasets that contain valuable insight into the interactions of complex systems, long- and short-term trends, and real-world phenomena. Converged infrastructure, operating on cloud deployments and private clusters, has emerged as an energy-efficient and cost-effective means of coping with these computing demands. However, increased collocation of storage and processing activities often leads to greater contention for resources in high-use situations. This issue is particularly pronounced when running distributed computations (such as MapReduce applications), because overall execution times depend on the completion time of the slowest task(s). In this study, we propose a framework that makes opinionated disk scheduling decisions to ensure high throughput for tasks that use I/O resources conservatively, while still maintaining the average performance of long-running batch processing operations. Our solution does not require modification of client applications or virtual machines, and we illustrate its efficacy on a cluster of 1,200 VMs with a variety of datasets that span over 1 Petabyte of information. In situations with high disk interference, our algorithm resulted in a 20% improvement in MapReduce completion times.
DOI: 10.1109/BDC.2015.32