Search Results - "Tomov, Stanimire" :: K.UTB search portal

Performance, Design, and Autotuning of Batched GEMM for GPUs

by Abdelfattah, Ahmad, Haidar, Azzam, Tomov, Stanimire, Dongarra, Jack
Published in High Performance Computing

Book Chapter

From CUDA to OpenCL: Towards a performance-portable solution for multi-platform GPU programming

by Du, Peng, Weber, Rick, Luszczek, Piotr, Tomov, Stanimire, Peterson, Gregory, Dongarra, Jack
Published in Parallel computing (01.08.2012)

Journal Article

Accelerating the SVD two stage bidiagonal reduction and divide and conquer using GPUs

by Gates, Mark, Tomov, Stanimire, Dongarra, Jack
Published in Parallel computing (01.05.2018)

Journal Article

Reducing the amount of out‐of‐core data access for GPU‐accelerated randomized SVD

by Lu, Yuechao, Yamazaki, Ichitaro, Ino, Fumihiko, Matsushita, Yasuyuki, Tomov, Stanimire, Dongarra, Jack
Published in Concurrency and computation (10.10.2020)

Journal Article

Mixed-Precision Orthogonalization Scheme and Adaptive Step Size for Improving the Stability and Performance of CA-GMRES on GPUs

by Yamazaki, Ichitaro, Tomov, Stanimire, Dong, Tingxing, Dongarra, Jack
Published in High Performance Computing for Computational Science -- VECPAR 2014

Book Chapter

Exploiting Block Structures of KKT Matrices for Efficient Solution of Convex Optimization Problems

by Iqbal, Zafar, Nooshabadi, Saeid, Yamazaki, Ichitaro, Tomov, Stanimire, Dongarra, Jack
Published in IEEE access (2021)

Journal Article

Non‐GPU‐resident symmetric indefinite factorization

by Yamazaki, Ichitaro, Tomov, Stanimire, Dongarra, Jack
Published in Concurrency and computation (10.03.2017)

Journal Article

Accelerating the reduction to upper Hessenberg, tridiagonal, and bidiagonal forms through hybrid GPU-based computing

by Tomov, Stanimire, Nath, Rajib, Dongarra, Jack
Published in Parallel computing (01.12.2010)

Journal Article

Optimizing the SVD Bidiagonalization Process for a Batch of Small Matrices

by Dong, Tingxing, Haidar, Azzam, Tomov, Stanimire, Dongarra, Jack
Published in Procedia computer science (2017)

Journal Article

A Framework for Batched and GPU-Resident Factorization Algorithms Applied to Block Householder Transformations

by Haidar, Azzam, Dong, Tingxing Tim, Tomov, Stanimire, Luszczek, Piotr, Dongarra, Jack
Published in High Performance Computing (01.01.2015)

Book Chapter

State-of-the-art eigensolvers for electronic structure calculations of large scale nano-systems

by Vömel, Christof, Tomov, Stanimire Z., Marques, Osni A., Canning, A., Wang, Lin-Wang, Dongarra, Jack J.
Published in Journal of computational physics (20.07.2008)

Journal Article

Fast Batched Matrix Multiplication for Small Sizes Using Half-Precision Arithmetic on GPUs

by Abdelfattah, Ahmad, Tomov, Stanimire, Dongarra, Jack
Published in 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (01.05.2019)

Conference Proceeding

Optimizing the Fast Fourier Transform Using Mixed Precision on Tensor Core Hardware

by Sorna, Anumeena, Cheng, Xiaohe, D'Azevedo, Eduardo, Won, Kwai, Tomov, Stanimire
Published in 2018 IEEE 25th International Conference on High Performance Computing Workshops (HiPCW) (01.12.2018)

Conference Proceeding

Towards dense linear algebra for hybrid GPU accelerated manycore systems

by Tomov, Stanimire, Dongarra, Jack, Baboulin, Marc
Published in Parallel computing (01.06.2010)

Journal Article

The use of bulk states to accelerate the band edge state calculation of a semiconductor quantum dot

by Vömel, Christof, Tomov, Stanimire Z., Wang, Lin-Wang, Marques, Osni A., Dongarra, Jack J.
Published in Journal of computational physics (01.05.2007)

Journal Article

Batched sparse and mixed-precision linear algebra interface for efficient use of GPU hardware accelerators in scientific applications

by Luszczek, Piotr, Abdelfattah, Ahmad, Anzt, Hartwig, Suzuki, Atsushi, Tomov, Stanimire
Published in Future generation computer systems (01.11.2024)

Journal Article

Solving Linear Diophantine Systems on Parallel Architectures

by Zaitsev, Dmitry, Tomov, Stanimire, Dongarra, Jack
Published in IEEE transactions on parallel and distributed systems (01.05.2019)

Journal Article

Impacts of Multi-GPU MPI Collective Communications on Large FFT Computation

by Ayala, Alan, Tomov, Stanimire, Luo, Xi, Shaeik, Hejer, Haidar, Azzam, Bosilca, George, Dongarra, Jack
Published in 2019 IEEE/ACM Workshop on Exascale MPI (ExaMPI) (01.11.2019)

Conference Proceeding

Mixed-precision iterative refinement using tensor cores on GPUs to accelerate solution of linear systems

by Haidar, Azzam, Bayraktar, Harun, Tomov, Stanimire, Dongarra, Jack, Higham, Nicholas J.
Published in Proceedings of the Royal Society. A, Mathematical, physical, and engineering sciences (01.11.2020)

Journal Article

Analysis and Design Techniques towards High-Performance and Energy-Efficient Dense Linear Solvers on GPUs

by Abdelfattah, Ahmad, Haidar, Azzam, Tomov, Stanimire, Dongarra, Jack
Published in IEEE transactions on parallel and distributed systems (01.12.2018)

Journal Article
