Parallel Training of a Back-Propagation Neural Network Using CUDA
| Published in | 2010 International Conference on Machine Learning and Applications, pp. 307 - 312 |
|---|---|
| Main Authors | |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 01.12.2010 |
| Summary | Training an Artificial Neural Network (ANN) is a time-consuming process in machine learning systems. In this work we provide an implementation of the back-propagation algorithm on CUDA, a parallel computing architecture developed by NVIDIA. Using CUBLAS, a CUDA implementation of the Basic Linear Algebra Subprograms (BLAS) library, simplifies the process; however, custom kernels were still necessary because CUBLAS does not provide all the required operations. The implementation was tested on two standard benchmark data sets, and the results show that the parallel training algorithm runs 63 times faster than its sequential version. |
|---|---|
| ISBN | 1424492114, 9781424492114 |
| DOI | 10.1109/ICMLA.2010.52 |
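The summary describes the general approach: express the dense matrix products of back-propagation training as CUBLAS calls and fall back on hand-written kernels for the operations CUBLAS lacks, such as element-wise activations. The sketch below illustrates that idea for a single forward-pass layer; it is not the authors' code. It uses the modern cuBLAS v2 API rather than the 2010-era interface, omits bias terms, and all function and variable names (`sigmoid_kernel`, `forward_layer`, `d_W`, `d_X`, etc.) are illustrative assumptions, not taken from the paper.

```cuda
#include <cuda_runtime.h>
#include <cublas_v2.h>

// Element-wise sigmoid activation. cuBLAS has no such operation, so this
// is the kind of step that still needs a hand-written kernel.
__global__ void sigmoid_kernel(const float* z, float* a, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        a[i] = 1.0f / (1.0f + expf(-z[i]));
    }
}

// Forward pass of one layer for a batch of inputs:
//   Z = W * X, then A = sigmoid(Z).
// W is (hidden x inputs), X is (inputs x batch), Z and A are
// (hidden x batch); all are device pointers in column-major layout,
// as cuBLAS expects. Bias terms are omitted for brevity.
void forward_layer(cublasHandle_t handle,
                   const float* d_W, const float* d_X,
                   float* d_Z, float* d_A,
                   int hidden, int inputs, int batch) {
    const float alpha = 1.0f, beta = 0.0f;

    // Z = W * X via a single cuBLAS matrix-matrix product.
    cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                hidden, batch, inputs,
                &alpha, d_W, hidden, d_X, inputs,
                &beta, d_Z, hidden);

    // A = sigmoid(Z), done by the custom kernel.
    int n = hidden * batch;
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    sigmoid_kernel<<<blocks, threads>>>(d_Z, d_A, n);
}
```

The backward pass follows the same pattern: the weight-gradient and error-propagation products map onto further `cublasSgemm` calls, while element-wise steps such as the sigmoid derivative and the weight update again require small custom kernels.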