MapReduce Tuning to Improve Distributed Machine Learning Performance
Published in | 2018 IEEE First International Conference on Artificial Intelligence and Knowledge Engineering (AIKE) pp. 198 - 200 |
---|---|
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.09.2018 |
Summary: | This paper shows how MapReduce parameters affect the distributed processing of machine learning programs built on machine learning libraries such as Hadoop Mahout and Spark MLlib. The authors constructed a virtualized cluster on top of Docker containers and measured distributed machine learning performance while varying Hadoop parameters such as the number of replicas, the block size, and the memory buffer size. |
---|---|
DOI: | 10.1109/AIKE.2018.00045 |
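
The parameter sweep described in the summary can be illustrated with a minimal Python sketch. It is not the authors' setup: the record does not say which Hadoop properties or values were used, so the mapping to `dfs.replication`, `dfs.blocksize`, and `mapreduce.task.io.sort.mb`, the value lists, and the helper names `sweep_configs` and `to_d_flags` are all assumptions made for illustration.

```python
from itertools import product

# Assumed mapping from the parameters named in the summary to Hadoop
# property names. The values are illustrative, not taken from the paper.
SWEEP = {
    "dfs.replication": [1, 2, 3],                    # number of replicas
    "dfs.blocksize": [64 * 1024**2, 128 * 1024**2],  # HDFS block size in bytes
    "mapreduce.task.io.sort.mb": [100, 200],         # map-side sort buffer in MB
}


def sweep_configs(grid):
    """Yield one dict per combination of the parameter values in the grid."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def to_d_flags(config):
    """Render a configuration as Hadoop generic '-D property=value' options."""
    flags = []
    for name, value in config.items():
        flags += ["-D", f"{name}={value}"]
    return flags


configs = list(sweep_configs(SWEEP))

if __name__ == "__main__":
    # Print the option list for each run of the hypothetical sweep.
    for cfg in configs:
        print(" ".join(to_d_flags(cfg)))
```

Each printed line could be appended to a job submission command so that one benchmark run is timed per configuration. Note that the block size and replication factor apply when a file is written to HDFS, so the input data would need to be reloaded between runs for those two settings to take effect.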