Energy Analysis of Hadoop Cluster Failure Recovery

Bibliographic Details
Published in 2013 International Conference on Parallel and Distributed Computing, Applications and Technologies, pp. 141-146
Main Authors Weiyue Xu, Ying Lu
Format Conference Proceeding
Language English
Published IEEE 01.12.2013
Abstract Energy efficiency is now an important metric for evaluating a computing system. However, saving energy is challenging because of many constraints. For example, Hadoop, one of the most popular distributed processing frameworks, randomly distributes three replicas of each data block to improve performance and fault tolerance. This mechanism limits the number of machines that can be turned off to save energy without affecting data availability. To overcome this limitation, previous research introduced a mechanism called the covering subset, which maintains a set of active nodes to ensure the immediate availability of data even when all other nodes are turned off. The covering-subset-based mechanism works smoothly if no failure happens. However, a node in the covering subset may fail. In this paper, we study energy-efficient failure recovery in Hadoop clusters. Rather than using only replication, as a Hadoop system does by default, we investigate both replication and erasure coding as possible redundancy mechanisms. We develop failure recovery algorithms for both systems and analytically compare their energy efficiency.
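The covering-subset idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the block placement, the node names, and the helper functions `is_covering_subset` and `blocks_unavailable_after_failure` are all hypothetical, chosen only to show what "every block stays available on the active nodes" means and why a failure inside the covering subset matters.

```python
from typing import Dict, Set

# Hypothetical placement: block id -> nodes holding one of its three replicas.
PLACEMENT: Dict[str, Set[str]] = {
    "b1": {"n1", "n2", "n3"},
    "b2": {"n1", "n4", "n5"},
    "b3": {"n1", "n3", "n5"},
}


def is_covering_subset(placement: Dict[str, Set[str]], active: Set[str]) -> bool:
    """Return True if every block has at least one replica on an active node."""
    return all(replicas & active for replicas in placement.values())


def blocks_unavailable_after_failure(
    placement: Dict[str, Set[str]], active: Set[str], failed: str
) -> Set[str]:
    """Return the blocks with no replica left on an active node once `failed` is lost."""
    survivors = active - {failed}
    return {block for block, replicas in placement.items() if not (replicas & survivors)}


if __name__ == "__main__":
    # n1 alone holds a replica of every block, so it is a covering subset.
    print(is_covering_subset(PLACEMENT, {"n1"}))
    # If n1 fails, no active node holds any block until other nodes are woken up.
    print(sorted(blocks_unavailable_after_failure(PLACEMENT, {"n1"}, "n1")))
```

In this toy placement a single node covers all data, which is the best case for energy savings and the worst case for a failure: losing that node leaves every block without an active replica, so some powered-off nodes must be brought back before the data can be read again.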
Author details
Weiyue Xu, weiyue@cse.unl.edu, Dept. of Comput. Sci. & Eng., Univ. of Nebraska-Lincoln, Lincoln, NE, USA
Ying Lu, ylu@cse.unl.edu, Dept. of Comput. Sci. & Eng., Univ. of Nebraska-Lincoln, Lincoln, NE, USA
DOI 10.1109/PDCAT.2013.29
Discipline Computer Science
EISBN 9781479924189; 1479924180; 1479924199; 9781479924196
ISSN 2379-5352
OpenAccessLink https://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=1273&context=cseconfwork
Subject terms Data transfer; Decoding; Distributed databases; Encoding; Energy consumption; Fault tolerance; Fault tolerant systems
URI https://ieeexplore.ieee.org/document/6904246