A Regression-based Model for End-to-End Latency Prediction for DNN Execution on GPUs

Bibliographic Details
Published in: 2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 343-345
Main Authors: Li, Ying; Sun, Yifan; Jog, Adwait
Format: Conference Proceeding
Language: English
Published: IEEE, 01.04.2023
DOI: 10.1109/ISPASS57527.2023.00047


Abstract: Deep neural networks (DNNs) have become increasingly popular in many domains because they reduce the need for human effort. However, today's DNN applications suffer from high computational complexity and sub-optimal device utilization. To address this problem, researchers have been proposing new system design solutions, which require performance models to support pre-product concept validation. This paper discusses how to build a simple, yet accurate, performance model for DNNs on GPUs. Our observations demonstrate prevalent linear relationships between the GPU execution times and the operation counts of DNN layers. Our proposed linear-regression-based execution time predictor can make predictions with an error rate of 28%.¹

¹ This material is based upon work supported in part by the Google Research Scholar Award and William & Mary. This work was performed in part using the computing facilities at William & Mary and Google Cloud. This work was done while Jog was with William & Mary. Jog is currently with the University of Virginia.
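The abstract describes a linear relationship between layer operation counts and GPU execution time, but this record does not give the model's details. The following is only a minimal sketch of that general idea, not the authors' predictor. It assumes one least-squares line per layer type and an end-to-end estimate formed by summing per-layer predictions. The function names (`fit_line`, `predict_latency`), the layer types, and all sample numbers are hypothetical and chosen purely for illustration; they are not measurements from the paper.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict_latency(layers, models):
    """Sum per-layer time predictions (assumed additive) for a whole network."""
    total = 0.0
    for kind, ops in layers:
        slope, intercept = models[kind]
        total += slope * ops + intercept
    return total


# Made-up samples, NOT data from the paper: (operation count, time in ms).
samples = {
    "conv": [(1.0e9, 2.1), (2.0e9, 4.1), (4.0e9, 8.1)],
    "fc": [(1.0e8, 0.3), (2.0e8, 0.5), (4.0e8, 0.9)],
}

# One fitted line per (hypothetical) layer type.
models = {
    kind: fit_line([ops for ops, _ in pts], [ms for _, ms in pts])
    for kind, pts in samples.items()
}

# A hypothetical three-layer network described as (layer type, operation count).
network = [("conv", 3.0e9), ("conv", 1.5e9), ("fc", 3.0e8)]
estimate_ms = predict_latency(network, models)
print(f"estimated end-to-end latency: {estimate_ms:.2f} ms")
```

Fitting a separate line per layer type is a design assumption here. Whether the paper groups layers this way, or how it arrives at the reported error rate, cannot be determined from this record.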
Authors
1. Li, Ying (yli81@wm.edu), William & Mary, USA
2. Sun, Yifan (ysun25@wm.edu), William & Mary, USA
3. Jog, Adwait (ajog@virginia.edu), William & Mary, USA
CODEN IEEPAD
EISBN 9798350397390
Subjects: Deep Neural Networks; Graphics Processing Units; Performance Model
URI https://ieeexplore.ieee.org/document/10158155