Analyzing the Efficiency and Bottleneck of Scientific Programs on Imagine Stream Processor by Simulation
Published in | 2008 IEEE International Symposium on Parallel and Distributed Processing with Applications, pp. 89-98 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE 01.12.2008 |
ISBN | 9780769534718 0769534716 |
ISSN | 2158-9178 |
DOI | 10.1109/ISPA.2008.17 |
Summary: | The Imagine stream processor has shown high performance and efficiency for media applications. Its potential for scientific applications is of great interest to the high-performance computing community. This paper investigates the subject from a new angle. It roughly classifies scientific programs into three classes based on their computation to memory access ratios. For each class, typical programs are written in the StreamC/KernelC stream language and run on the cycle-accurate Imagine simulator. The performance data are analyzed in depth, with special attention to performance bottlenecks, and are compared against data obtained on two general-purpose x86 processors. The results show that programs with no DRAM accesses attain high floating-point performance and efficiency on Imagine; their performance is restricted only by limited ILP (Instruction-Level Parallelism) and load imbalance across ALUs. Programs with computation to memory operation ratios of O(n) attain absolute floating-point performance on Imagine comparable to that obtained on general-purpose processors, but their floating-point efficiency is not satisfactory. It is essential to optimize these programs for high SRF (Stream Register File) and LRF (Local Register File) reuse and high ILP on Imagine. Programs with lower computation to memory operation ratios attain much lower floating-point performance and efficiency on Imagine than on x86 processors. |
---|---|
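
As a rough illustration of what a computation to memory access ratio of O(n) means, the sketch below computes flops per memory word for two textbook kernels. The kernels (dense matrix multiply and vector addition), the operation counts, and the function names are assumptions chosen for illustration; they are not taken from the paper, whose benchmark programs are not listed in this record.

```python
def matmul_ratio(n: int) -> float:
    """Flops per memory word for an n x n dense matrix multiply.

    Assumes 2*n**3 flops and 3*n**2 words of off-chip traffic
    (read A and B once, write C once), i.e. ideal on-chip reuse.
    """
    return (2 * n**3) / (3 * n**2)


def vector_add_ratio(n: int) -> float:
    """Flops per memory word for c = a + b on length-n vectors.

    Assumes n additions and 3*n words of traffic (read a and b, write c).
    """
    return n / (3 * n)


if __name__ == "__main__":
    # Print both ratios for a few problem sizes to show how they scale.
    for n in (100, 200, 400):
        print(n, matmul_ratio(n), vector_add_ratio(n))
```

Under these assumptions the matrix multiply ratio doubles whenever n doubles, which is O(n) behavior, while the vector addition ratio stays fixed at 1/3, which would place it in the lower-ratio class described in the summary.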