Analyzing the Efficiency and Bottleneck of Scientific Programs on Imagine Stream Processor by Simulation

Bibliographic Details
Published in 2008 IEEE International Symposium on Parallel and Distributed Processing with Applications, pp. 89-98
Main Authors Yonggang Che, Chuanfu Xu, Zhenghua Wang
Format Conference Proceeding
Language English
Published IEEE 01.12.2008
ISBN 9780769534718, 0769534716
ISSN 2158-9178
DOI 10.1109/ISPA.2008.17

Abstract The Imagine stream processor has shown high performance and efficiency for media applications. Its potential for scientific applications is of great interest to the high-performance computing community. This paper investigates the subject from a new angle. It roughly divides scientific programs into three classes based on their computation-to-memory-access ratios. For each class, typical programs are written in the StreamC/KernelC stream languages and run on the cycle-accurate simulator of Imagine. The performance data are analyzed in depth, with particular attention to the performance bottlenecks, and are compared against data from two general-purpose x86 processors. The results show that programs with no DRAM accesses attain high floating-point performance and efficiency on Imagine. The performance of these programs is restricted only by limited ILP (Instruction-Level Parallelism) and load imbalance across ALUs. Programs with computation-to-memory-operation ratios of O(n) attain absolute floating-point performance on Imagine comparable to that obtained on general-purpose processors, but their floating-point efficiencies are not satisfactory. It is essential to optimize these programs for high SRF (Stream Register File) and LRF (Local Register File) reuse and for high ILP on Imagine. Programs with lower computation-to-memory-operation ratios attain much lower floating-point performance and efficiency on Imagine than on x86 processors.
Author Affiliations
Yonggang Che, Sch. of Comput., Nat. Univ. of Defense Technol., Changsha
Chuanfu Xu, Sch. of Comput., Nat. Univ. of Defense Technol., Changsha
Zhenghua Wang, Sch. of Comput., Nat. Univ. of Defense Technol., Changsha
Discipline Computer Science
LCCN 2008908294
SubjectTerms Analytical models
Application software
Arithmetic
Bandwidth
Computational modeling
Computer architecture
floating-point efficiency
High performance computing
Image analysis
Imagine stream processor
performance bottleneck
performance evaluation
scientific applications
Scientific computing
Streaming media
URI https://ieeexplore.ieee.org/document/4725139