Selection and replacement algorithms for memory performance improvement in Spark

Bibliographic Details
Published in Concurrency and computation Vol. 28; no. 8; pp. 2473-2486
Main Authors Duan, Mingxing; Li, Kenli; Tang, Zhuo; Xiao, Guoqing; Li, Keqin
Format Journal Article
Language English
Published Blackwell Publishing Ltd 10.06.2016
Summary: As a parallel computation framework, Spark can cache the partitions of repeatedly used resilient distributed datasets (RDDs) on different nodes to speed up computation. However, Spark lacks a good mechanism for selecting which RDDs should have their partitions cached in limited memory. In this paper, we propose a novel selection algorithm by which Spark automatically selects the RDDs whose partitions are cached in memory according to the number of times each RDD is used. Our selection algorithm speeds up iterative computations. Nevertheless, when many new RDDs are chosen for caching while the limited memory is already full, the system falls back on the least recently used (LRU) replacement algorithm. The LRU algorithm considers only whether RDD partitions were recently used and ignores other factors such as computation cost. We therefore also put forward a novel replacement algorithm, called the weight replacement (WR) algorithm, which takes into account each partition's computation cost, number of uses, and size. Experimental results show that Spark computes faster with our selection algorithm than without it, and that Spark with the WR algorithm shows better performance. Copyright © 2015 John Wiley & Sons, Ltd.
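The two ideas in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not give the selection threshold or the weight formula, so the `min_uses` threshold, the `Partition` fields, and the formula `cost * uses / size` are all assumptions chosen only to show how use count, computation cost, and size could be combined. The sketch is plain Python and does not call any Spark API.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    name: str
    cost: float   # estimated time to recompute the partition (hypothetical metric)
    uses: int     # number of times the partition is used
    size: float   # memory footprint in MB


def select_for_caching(use_counts, min_uses=2):
    """Select RDDs used at least `min_uses` times (assumed threshold)."""
    return sorted(name for name, n in use_counts.items() if n >= min_uses)


def weight(p):
    """Assumed weight: costly, frequently used, small partitions score high."""
    return p.cost * p.uses / p.size


def evict_by_weight(cached, needed_mb):
    """Evict the lowest-weight partitions until `needed_mb` has been freed."""
    evicted, freed = [], 0.0
    for p in sorted(cached, key=weight):
        if freed >= needed_mb:
            break
        evicted.append(p)
        freed += p.size
    return evicted


# Selection: only RDDs that are reused are worth caching.
use_counts = {"lines": 1, "parsed": 3, "ranks": 5}
print(select_for_caching(use_counts))  # ['parsed', 'ranks']

# Replacement: free 150 MB by dropping the least valuable partitions first.
cached = [
    Partition("parsed_0", cost=4.0, uses=3, size=64.0),
    Partition("ranks_0", cost=10.0, uses=5, size=32.0),
    Partition("tmp_0", cost=1.0, uses=1, size=128.0),
]
print([p.name for p in evict_by_weight(cached, needed_mb=150.0)])  # ['tmp_0', 'parsed_0']
```

Under these assumed weights, the small but expensive and frequently used `ranks_0` stays in memory, whereas an LRU policy would evict it if it happened to be the partition touched least recently.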
Bibliography:ark:/67375/WNG-T16RV2GN-M
istex:925DE55324FE265B1A97E7CA7176A8898AD143FF
ArticleID:CPE3584
ISSN: 1532-0626, 1532-0634
DOI: 10.1002/cpe.3584