Designing Parallel Sparse Matrix Transposition Algorithm Using ELLPACK-R for GPUs
Published in | Computer Engineering and Technology pp. 61 - 68 |
---|---|
Main Authors | , , , , , |
Format | Book Chapter |
Language | English |
Published | Berlin, Heidelberg: Springer Berlin Heidelberg, 2016 |
Series | Communications in Computer and Information Science |
Subjects | |
Summary: | In this paper, we propose a parallel algorithm for sparse matrix transposition in the ELLPACK-R format on graphics processing units (GPUs). By exploiting the GPU's high memory bandwidth and its texture memory, the performance of the algorithm is improved. Experimental results show that the proposed algorithm runs up to 8x faster on an Nvidia Tesla C2070 than the implementation on an Intel Xeon E5-2650 CPU. We also conclude that accelerating transposition is not worthwhile for ELLPACK-R matrices whose rows differ widely in their number of nonzero elements. |
---|---|
ISBN: | 9783662492826 3662492822 |
ISSN: | 1865-0929 1865-0937 |
DOI: | 10.1007/978-3-662-49283-3_7 |
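
For readers unfamiliar with the format named in the summary, the following is a minimal sequential sketch of ELLPACK-R storage and a count-then-scatter transposition. It is not the chapter's GPU algorithm, which is not reproduced in this record. The function names `dense_to_ellpack_r` and `transpose_ellpack_r` are hypothetical, and plain Python lists stand in for device arrays. ELLPACK-R keeps a padded value array, a padded column-index array of the same shape, and a per-row length array.

```python
def dense_to_ellpack_r(dense):
    """Pack a dense matrix (list of rows) into ELLPACK-R form.

    Returns (values, cols, row_len): two padded 2-D lists of equal shape
    and the true number of nonzeros in each row.
    """
    row_len = [sum(1 for v in row if v != 0) for row in dense]
    width = max(row_len, default=0)  # every row is padded to the longest row
    values, cols = [], []
    for row in dense:
        nz = [(j, v) for j, v in enumerate(row) if v != 0]
        pad = width - len(nz)
        values.append([v for _, v in nz] + [0] * pad)
        cols.append([j for j, _ in nz] + [0] * pad)
    return values, cols, row_len


def transpose_ellpack_r(values, cols, row_len, n_cols):
    """Transpose an ELLPACK-R matrix with n_cols columns (sequential sketch)."""
    # Pass 1: nonzeros per column become the row lengths of the transpose.
    t_row_len = [0] * n_cols
    for i, length in enumerate(row_len):
        for k in range(length):
            t_row_len[cols[i][k]] += 1

    t_width = max(t_row_len, default=0)
    t_values = [[0] * t_width for _ in range(n_cols)]
    t_cols = [[0] * t_width for _ in range(n_cols)]

    # Pass 2: scatter each nonzero (i, j, v) into row j of the transpose.
    # Visiting rows in increasing i keeps column indices sorted in each output row.
    fill = [0] * n_cols  # next free slot in each output row
    for i, length in enumerate(row_len):
        for k in range(length):
            j = cols[i][k]
            slot = fill[j]
            t_values[j][slot] = values[i][k]
            t_cols[j][slot] = i
            fill[j] = slot + 1
    return t_values, t_cols, t_row_len


# Hypothetical 4x3 example matrix, chosen only for illustration.
dense = [[1, 0, 2],
         [0, 3, 0],
         [4, 0, 5],
         [0, 0, 6]]
values, cols, row_len = dense_to_ellpack_r(dense)
t_values, t_cols, t_row_len = transpose_ellpack_r(values, cols, row_len, n_cols=3)
```

In this example the transpose is padded to width 3 because a single column holds three nonzeros, although the other two output rows need only two slots and one slot. Padding of this kind grows with the spread in row lengths, which is one plausible reading of the summary's caution about matrices with highly uneven rows.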