A Regression-based Model for End-to-End Latency Prediction for DNN Execution on GPUs
Published in | 2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) pp. 343 - 345 |
---|---|
Main Authors | Li, Ying; Sun, Yifan; Jog, Adwait |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.04.2023 |
Subjects | Deep Neural Networks; Graphics Processing Units; Performance Model |
DOI | 10.1109/ISPASS57527.2023.00047 |
Abstract | Deep neural networks (DNNs) have become increasingly popular in many domains as they reduce the need for human effort. However, today's DNN applications suffer from high computational complexity and sub-optimal device utilization. To solve this problem, researchers have been proposing new system design solutions, which require performance models to help them with pre-product concept validation. This paper discusses how to build a simple, yet accurate, performance model for DNNs on GPUs. Our observations demonstrate prevalent linear relationships between the GPU execution times and operation counts of DNN layers. Our proposed linear-regression-based execution time predictor can make predictions with an error rate of 28%. Footnote: This material is based upon work supported in part by the Google Research Scholar Award and William & Mary. This work was performed in part using the computing facilities at William & Mary and Google Cloud. This work was done while Jog was with William & Mary. Jog is currently with the University of Virginia. |
---|---|
Author | Jog, Adwait; Li, Ying; Sun, Yifan |
Author_xml | – sequence: 1 givenname: Ying surname: Li fullname: Li, Ying email: yli81@wm.edu organization: William & Mary,USA – sequence: 2 givenname: Yifan surname: Sun fullname: Sun, Yifan email: ysun25@wm.edu organization: William & Mary,USA – sequence: 3 givenname: Adwait surname: Jog fullname: Jog, Adwait email: ajog@virginia.edu organization: William & Mary,USA |
CODEN | IEEPAD |
ContentType | Conference Proceeding |
EISBN | 9798350397390 |
EndPage | 345 |
ExternalDocumentID | 10158155 |
Genre | orig-research |
IsPeerReviewed | false |
IsScholarly | false |
PageCount | 3 |
ParticipantIDs | ieee_primary_10158155 |
PublicationCentury | 2000 |
PublicationDate | 2023-April |
PublicationDateYYYYMMDD | 2023-04-01 |
PublicationDate_xml | – month: 04 year: 2023 text: 2023-April |
PublicationDecade | 2020 |
PublicationTitle | 2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) |
PublicationTitleAbbrev | ISPASS |
PublicationYear | 2023 |
Publisher | IEEE |
Publisher_xml | – name: IEEE |
StartPage | 343 |
SubjectTerms | Deep Neural Networks; Graphics Processing Units; Performance Model |
Title | A Regression-based Model for End-to-End Latency Prediction for DNN Execution on GPUs |
URI | https://ieeexplore.ieee.org/document/10158155 |
hasFullText | 1 |
inHoldings | 1 |
isFullTextHit | |
isPrint | |
link | http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV3fS8MwEA66J59UnPibPPiaurRp0zwO3ZyiZbgN9jaSXCqidKIdqH-9l3RTFASh0JKUttwl_b42990RcipVisAmNePWGiZ4DCzPlWRxnnGbeSVoSNd0W2SDibieptOlWD1oYZxzIfjMRf4wrOXD3C78rzKc4TzNEQDXyTqOs0astVT98o46uxoNu6MR8o9YRr4qeMjE-bNsSkCN_iYpVvdrgkUeo0VtIvvxKxXjvx9oi7S_BXp0-AU922TNVTtk3KV37r6Ja62YhyegvtTZE0ViSnsVsHrOcEdvtCfK73gFv0rjPRPOuCgK2ntzNgxFitvlcPLaJpN-b3w-YMuiCewh7oialaWGBE0OBpmYc6WxIIUw3Nic40sW6aAQpUoAklKJ1JXcOJBWZhZS21E4_3dJq5pXbo9QzRX2y8SIWONXitFZbCUgngMgjTF6n7S9SWbPTV6M2coaB3-0H5IN75Ym7uWItOqXhTtGSK_NSXDlJ25Foaw |
linkProvider | IEEE |
linkToHtml | http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwjV1dS8MwFA06H_RJxYnf5sHX1KVNm-Zx6OamWxlug72NfFWG0ol2oP56b9JNURCEQkta2nBvm3PS3HMvQhdcxABsXBKqtSKMhoakqeAkTBOqE6cE9ema-lnSGbPbSTxZitW9FsZa64PPbOAO_Vq-meuF-1UGXziNUwDAdbQBwM_iSq611P3ShrjsDgfN4RAYSMgDVxfc5-L8WTjF40Z7G2WrJ1bhIo_BolSB_viVjPHfXdpB9W-JHh58gc8uWrPFHho18b19qCJbC-IAymBX7OwJAzXFrcKQck5gh3vSUeV3uINbp3G-8VdcZxluvVntX0YM281g_FpH43ZrdNUhy7IJZBY2WEnyXJoIjG4UcDFrc6UNZ0xRpVMKwywQQsZyERkT5YLFNqfKGq55ok2sGwJGgH1UK-aFPUBYUgHneaRYKGGeomQSam4A0Y0BIqPkIao7k0yfq8wY05U1jv5oP0ebnVG_N-11s7tjtOVcVEXBnKBa-bKwpwDwpTrzbv0EFEyk-Q |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=proceeding&rft.title=2023+IEEE+International+Symposium+on+Performance+Analysis+of+Systems+and+Software+%28ISPASS%29&rft.atitle=A+Regression-based+Model+for+End-to-End+Latency+Prediction+for+DNN+Execution+on+GPUs&rft.au=Li%2C+Ying&rft.au=Sun%2C+Yifan&rft.au=Jog%2C+Adwait&rft.date=2023-04-01&rft.pub=IEEE&rft.spage=343&rft.epage=345&rft_id=info:doi/10.1109%2FISPASS57527.2023.00047&rft.externalDocID=10158155 |
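
The abstract describes a linear relationship between per-layer operation counts and GPU execution time. As a rough illustration of that idea only, the sketch below fits a straight line to hypothetical (operation count, time) samples with ordinary least squares and then sums per-layer predictions to estimate end-to-end latency. The function names, the sample numbers, the units, and the choice to sum per-layer predictions are all assumptions made for this example; the record does not describe the authors' actual implementation.

```python
# Illustrative sketch only. All data values are made up; they are NOT
# measurements from the paper, and the helper names are hypothetical.

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_latency(layer_ops, slope, intercept):
    """Assumed end-to-end estimate: the sum of per-layer linear predictions."""
    return sum(slope * ops + intercept for ops in layer_ops)


def mean_abs_pct_error(predicted, measured):
    """Mean absolute percentage error, returned as a fraction (0.1 == 10%)."""
    return sum(abs(p - m) / m for p, m in zip(predicted, measured)) / len(measured)


# Hypothetical training samples: operation counts (GFLOPs) and times (ms).
train_ops = [0.5, 1.0, 2.0, 4.0]
train_ms = [0.30, 0.52, 0.98, 1.85]
slope, intercept = fit_line(train_ops, train_ms)

# Hypothetical three-layer network described only by its operation counts.
estimate_ms = predict_latency([0.8, 1.6, 3.2], slope, intercept)
print(f"slope={slope:.3f} ms/GFLOP, intercept={intercept:.3f} ms")
print(f"estimated end-to-end latency: {estimate_ms:.3f} ms")
```

In practice one might fit a separate line per layer type, since different kernels can have different slopes; whether the paper does so is not stated in this record.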