A Scalable Task Parallelism Approach for LU Decomposition with Multicore CPUs
Published in | 2016 Second International Workshop on Extreme Scale Programming Models and Middleware (ESPM2) pp. 17-23 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE 01.11.2016 |
Subjects | |
Summary: | Many scientific applications involve linear systems A · x = b that must be solved for several different right-hand-side vectors b. LU decomposition, a variant of Gaussian elimination, is an efficient technique for solving such systems. Its main idea is to factorize A into a lower (L) triangular and an upper (U) triangular matrix such that A = LU. This paper presents an OpenMP task-parallel approach for the LU factorization of dense matrices. The tasking model is based on the individual computational tasks that occur during the block-wise LU factorization. We describe the right-looking variant of the LU decomposition algorithm in the task-parallel approach and provide an efficient implementation of the algorithm for shared-memory machines. We demonstrate that, with the task scheduling features provided by OpenMP 4.0, the right-looking LU decomposition can scale well. We then conduct an experimental evaluation on a multicore platform, comparing the task-parallel implementation with the parallel-for implementation of Gaussian elimination with pivoting and LU decomposition using the GNU Scientific Library. From the experiments we conclude that the proposed task-based implementation is a good solution for solving large systems of linear equations using LU decomposition. |
---|---|
DOI: | 10.1109/ESPM2.2016.008 |
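
The summary describes a block-wise, right-looking LU factorization whose per-block operations become OpenMP tasks. The C sketch below illustrates that general idea only; it is not the authors' implementation. It assumes a row-major n × n matrix, a block size `nb` that divides `n`, and no pivoting (so the input should be, for example, diagonally dominant). The function names `lu_tasks`, `lu_solve`, and the static helpers are hypothetical. Compiled without OpenMP support, the pragmas are ignored and the code runs serially.

```c
#include <assert.h>

/* In-place LU of one nb x nb diagonal block (leading dimension n), no pivoting. */
static void lu_diag(double *a, int n, int nb) {
    for (int k = 0; k < nb; k++)
        for (int i = k + 1; i < nb; i++) {
            a[i * n + k] /= a[k * n + k];
            for (int j = k + 1; j < nb; j++)
                a[i * n + j] -= a[i * n + k] * a[k * n + j];
        }
}

/* Row panel: solve L_kk * X = B with unit lower-triangular L_kk; B becomes U_kj. */
static void trsm_lower(const double *d, double *b, int n, int nb) {
    for (int k = 0; k < nb; k++)
        for (int i = k + 1; i < nb; i++)
            for (int j = 0; j < nb; j++)
                b[i * n + j] -= d[i * n + k] * b[k * n + j];
}

/* Column panel: solve X * U_kk = B with upper-triangular U_kk; B becomes L_ik. */
static void trsm_upper(const double *d, double *b, int n, int nb) {
    for (int k = 0; k < nb; k++)
        for (int i = 0; i < nb; i++) {
            b[i * n + k] /= d[k * n + k];
            for (int j = k + 1; j < nb; j++)
                b[i * n + j] -= b[i * n + k] * d[k * n + j];
        }
}

/* Trailing update: C -= A * B on nb x nb blocks. */
static void gemm_sub(const double *a, const double *b, double *c, int n, int nb) {
    for (int i = 0; i < nb; i++)
        for (int k = 0; k < nb; k++)
            for (int j = 0; j < nb; j++)
                c[i * n + j] -= a[i * n + k] * b[k * n + j];
}

/* Right-looking blocked LU; A is overwritten with L (unit diagonal, below) and U. */
void lu_tasks(double *A, int n, int nb) {
    assert(nb > 0 && n % nb == 0);   /* assumption: block size divides n */
    int nt = n / nb;
    #pragma omp parallel
    #pragma omp single
    for (int k = 0; k < nt; k++) {
        double *akk = &A[(k * nb) * n + k * nb];
        /* Task dependences are keyed on the first element of each block. */
        #pragma omp task depend(inout: akk[0])
        lu_diag(akk, n, nb);
        for (int j = k + 1; j < nt; j++) {
            double *akj = &A[(k * nb) * n + j * nb];
            #pragma omp task depend(in: akk[0]) depend(inout: akj[0])
            trsm_lower(akk, akj, n, nb);
        }
        for (int i = k + 1; i < nt; i++) {
            double *aik = &A[(i * nb) * n + k * nb];
            #pragma omp task depend(in: akk[0]) depend(inout: aik[0])
            trsm_upper(akk, aik, n, nb);
        }
        for (int i = k + 1; i < nt; i++)
            for (int j = k + 1; j < nt; j++) {
                double *aik = &A[(i * nb) * n + k * nb];
                double *akj = &A[(k * nb) * n + j * nb];
                double *aij = &A[(i * nb) * n + j * nb];
                #pragma omp task depend(in: aik[0], akj[0]) depend(inout: aij[0])
                gemm_sub(aik, akj, aij, n, nb);
            }
    }   /* all tasks have completed at the barrier that ends the single region */
}

/* Solve L*U*x = b using the packed factors; x holds b on entry and x on exit. */
void lu_solve(const double *LU, int n, double *x) {
    for (int i = 0; i < n; i++)            /* forward substitution with unit L */
        for (int j = 0; j < i; j++)
            x[i] -= LU[i * n + j] * x[j];
    for (int i = n - 1; i >= 0; i--) {     /* backward substitution with U */
        for (int j = i + 1; j < n; j++)
            x[i] -= LU[i * n + j] * x[j];
        x[i] /= LU[i * n + i];
    }
}
```

Calling `lu_tasks(A, n, nb)` once and then `lu_solve(A, n, b)` for each new vector b reuses the factors, which is the benefit the summary attributes to LU decomposition when many right-hand sides share the same A.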