A Hybrid Approach for Optimizing Parallel Clustering Throughput using the GPU

We introduce Hybrid-Dbscan, which uses the GPU and CPUs to optimize clustering throughput. The main idea is to exploit the memory bandwidth on the GPU for fast index searches, and to optimize data transfers between host and GPU, to alleviate the potential negative performance impact of the PCIe inte...

Bibliographic Details
Published in IEEE Transactions on Parallel and Distributed Systems, Vol. 30, no. 4, pp. 766-777
Main Authors Gowanlock, Michael; Rude, Cody M.; Blair, David M.; Li, Justin D.; Pankratius, Victor
Format Journal Article
Language English
Published New York IEEE 01.04.2019
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Abstract We introduce Hybrid-Dbscan, which uses the GPU and CPUs to optimize clustering throughput. The main idea is to exploit the memory bandwidth on the GPU for fast index searches, and to optimize data transfers between host and GPU, to alleviate the potential negative performance impact of the PCIe interconnect. We propose and compare two GPU kernels that exploit grid-based indexing schemes to improve neighborhood search performance. We employ a batching scheme for host-GPU data transfers to work around limited GPU memory, and exploit concurrent operations on the host and GPU. This scheme is robust with respect to both sparse and dense data distributions and avoids buffer overflows that would otherwise degrade performance. We evaluate our approaches on ionospheric total electron content datasets as well as intermediate-redshift galaxies from the Sloan Digital Sky Survey. Hybrid-Dbscan outperforms the reference implementation across a range of application scenarios, including small workloads, which are typically the domain of CPU-only algorithms. We advance an empirical response time performance model of Hybrid-Dbscan that uses the underlying properties of the datasets. With only a single execution of Hybrid-Dbscan on a dataset, we are able to accurately predict the response time for a range of ε search distances.
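The grid-based neighborhood search mentioned in the abstract can be illustrated with a minimal CPU-side sketch. This is not the authors' GPU kernel: it assumes 2-D points and square cells of side ε, and the names build_grid and eps_neighbors are hypothetical, introduced only for illustration. With cells of side ε, every point within distance ε of a query lies in the query's cell or one of its eight adjacent cells, so only those nine cells need to be scanned.

```python
from collections import defaultdict
from math import floor


def build_grid(points, eps):
    """Bin 2-D points into square cells of side eps (hypothetical helper)."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(floor(x / eps), floor(y / eps))].append(idx)
    return grid


def eps_neighbors(points, grid, eps, query_idx):
    """Return sorted indices of points within eps of points[query_idx].

    The query point itself is included, as in the usual DBSCAN
    definition of an eps-neighborhood.
    """
    qx, qy = points[query_idx]
    cx, cy = floor(qx / eps), floor(qy / eps)
    eps_sq = eps * eps
    found = []
    # Only the query's cell and its 8 adjacent cells can hold neighbors.
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for idx in grid.get((cx + dx, cy + dy), ()):
                px, py = points[idx]
                if (px - qx) ** 2 + (py - qy) ** 2 <= eps_sq:
                    found.append(idx)
    return sorted(found)


points = [(0.0, 0.0), (0.5, 0.0), (2.0, 2.0), (0.9, 0.9)]
grid = build_grid(points, eps=1.0)
print(eps_neighbors(points, grid, 1.0, 0))
```

On a GPU, one would typically assign each query point to a thread and store the cells in flat arrays rather than a dictionary; that layout is an assumption here and is not taken from the record.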
Author Gowanlock, Michael
Li, Justin D.
Rude, Cody M.
Blair, David M.
Pankratius, Victor
Author_xml – sequence: 1
  givenname: Michael
  orcidid: 0000-0002-0826-6204
  surname: Gowanlock
  fullname: Gowanlock, Michael
  email: michael.gowanlock@nau.edu
  organization: School of Informatics, Computing & Cyber Systems, Northern Arizona University, Flagstaff, AZ, USA
– sequence: 2
  givenname: Cody M.
  orcidid: 0000-0002-9584-2600
  surname: Rude
  fullname: Rude, Cody M.
  email: cmrude@mit.edu
  organization: Massachusetts Institute of Technology, Cambridge, MA, USA
– sequence: 3
  givenname: David M.
  surname: Blair
  fullname: Blair, David M.
  email: david_blair@brown.edu
  organization: Department of Chemistry and the Department of Earth, Environmental and Planetary Sciences, Brown University, Providence, RI, USA
– sequence: 4
  givenname: Justin D.
  orcidid: 0000-0003-3315-2038
  surname: Li
  fullname: Li, Justin D.
  email: jdli@alumni.stanford.edu
  organization: AAAS Science & Technology Policy Fellow, Washington, DC, USA
– sequence: 5
  givenname: Victor
  orcidid: 0000-0002-4658-6583
  surname: Pankratius
  fullname: Pankratius, Victor
  email: pankrat@mit.edu
  organization: Massachusetts Institute of Technology, Cambridge, MA, USA
CODEN ITDSEO
CitedBy_id crossref_primary_10_1016_j_jpdc_2022_06_005
crossref_primary_10_1016_j_knosys_2022_108501
crossref_primary_10_1016_j_ins_2021_08_036
crossref_primary_10_1109_TETC_2020_3048671
Cites_doi 10.1145/2503210.2503262
10.1016/j.procs.2013.05.200
10.1145/602259.602266
10.1007/978-3-642-03722-1_3
10.1145/304182.304187
10.1109/TPDS.2014.2347041
10.1007/978-3-319-25087-8_25
10.1007/s11704-013-3158-3
10.1109/IPDPS.2017.17
10.1109/TPDS.2017.2675421
10.1145/1645953.1646038
10.1109/IPDPS.2009.5161068
10.1109/ICIME.2010.5477926
10.1145/1964179.1964184
10.1515/9781400874668
10.1088/0067-0049/219/1/12
10.1109/SISAP.2009.9
10.1145/2304576.2304621
10.1145/1376616.1376670
10.1109/MM.2017.37
10.1016/j.jco.2009.02.011
10.1109/SC.2014.51
10.1145/2390226.2390229
10.1109/SC.2012.9
10.1109/TPDS.2015.2500896
10.1109/IPDPS.2016.10
10.1109/IPDPS.2015.24
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019
DOI 10.1109/TPDS.2018.2869777
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005-present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library Online
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DatabaseTitle CrossRef
Technology Research Database
Computer and Information Systems Abstracts – Academic
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts Professional
DatabaseTitleList Technology Research Database

Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library Online
  url: https://proxy.k.utb.cz/login?url=https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
Computer Science
EISSN 1558-2183
EndPage 777
ExternalDocumentID 10_1109_TPDS_2018_2869777
8462789
Genre orig-research
GrantInformation_xml – fundername: National Science Foundation
  grantid: ACI-1442997
  funderid: 10.13039/100000001
IEDL.DBID RIE
ISSN 1045-9219
IngestDate Thu Oct 10 18:13:27 EDT 2024
Fri Aug 23 04:42:09 EDT 2024
Wed Jun 26 19:27:53 EDT 2024
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 4
Language English
LinkModel DirectLink
ORCID 0000-0002-9584-2600
0000-0002-0826-6204
0000-0003-3315-2038
0000-0002-4658-6583
OpenAccessLink https://doi.org/10.1109/tpds.2018.2869777
PQID 2191259648
PQPubID 85437
PageCount 12
ParticipantIDs proquest_journals_2191259648
crossref_primary_10_1109_TPDS_2018_2869777
ieee_primary_8462789
PublicationCentury 2000
PublicationDate 2019-04-01
PublicationDateYYYYMMDD 2019-04-01
PublicationDate_xml – month: 04
  year: 2019
  text: 2019-04-01
  day: 01
PublicationDecade 2010
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on parallel and distributed systems
PublicationTitleAbbrev TPDS
PublicationYear 2019
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
SourceID proquest
crossref
ieee
SourceType Aggregation Database
Publisher
StartPage 766
SubjectTerms Algorithms
Clustering
Clustering algorithms
Datasets
DBSCAN
Galaxies
GPGPU
Graphics processing units
in-memory database
Indexing
Kernel
Optimization
parallel clustering
Performance degradation
query optimization
Red shift
Response time (computers)
Sky surveys (astronomy)
spatial databases
Throughput
Time factors
Title A Hybrid Approach for Optimizing Parallel Clustering Throughput using the GPU
URI https://ieeexplore.ieee.org/document/8462789
https://www.proquest.com/docview/2191259648
Volume 30