GLES: A Practical GPGPU Optimizing Compiler Using Data Sharing and Thread Coarsening


Bibliographic Details
Published in: Languages and Compilers for Parallel Computing, pp. 36-50
Main Authors: Lin, Zhen; Gao, Xiaopeng; Wan, Han; Jiang, Bo
Format: Book Chapter
Language: English
Published: Cham: Springer International Publishing, 2015
Series: Lecture Notes in Computer Science
Summary: Writing optimized CUDA programs for General-Purpose Graphics Processing Units (GPGPUs) is complicated and error-prone. Most earlier compiler optimization methods are impractical for the many applications that contain divergent control flow, and they fail to fully exploit optimization opportunities in data sharing and thread coarsening. In this paper, we present GLES, an optimizing compiler for GPGPU programs. GLES proposes two optimization techniques based on divergence analysis. The first is a data sharing optimization for data reuse and bandwidth enhancement. The second is thread granularity coarsening, which reduces redundant instructions. Our experiments on 6 real-world programs show that GPGPU programs optimized by GLES achieve performance similar to that of manually tuned GPGPU programs. Furthermore, GLES is applicable to a much wider range of GPGPU programs than the state-of-the-art GPGPU optimizing compiler, and it achieves higher or comparable performance on 8 out of 9 benchmarks.
ISBN: 331917472X, 9783319174723
ISSN: 0302-9743, 1611-3349
DOI: 10.1007/978-3-319-17473-0_3