A Hybrid Approach for Optimizing Parallel Clustering Throughput using the GPU
We introduce Hybrid-Dbscan, which uses the GPU and CPUs to optimize clustering throughput. The main idea is to exploit the memory bandwidth of the GPU for fast index searches, and to optimize data transfers between host and GPU, to alleviate the potential negative performance impact of the PCIe inte...
Published in | IEEE Transactions on Parallel and Distributed Systems, Vol. 30, no. 4, pp. 766-777 |
---|---|
Main Authors | Gowanlock, Michael; Rude, Cody M.; Blair, David M.; Li, Justin D.; Pankratius, Victor |
Format | Journal Article |
Language | English |
Published | New York: IEEE, 01.04.2019; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
Abstract | We introduce Hybrid-Dbscan, which uses the GPU and CPUs to optimize clustering throughput. The main idea is to exploit the memory bandwidth of the GPU for fast index searches, and to optimize data transfers between host and GPU, to alleviate the potential negative performance impact of the PCIe interconnect. We propose and compare two GPU kernels that exploit grid-based indexing schemes to improve neighborhood search performance. We employ a batching scheme for host-GPU data transfers to work around limited GPU memory, and exploit concurrent operations on the host and GPU. This scheme is robust with respect to both sparse and dense data distributions and avoids buffer overflows that would otherwise degrade performance. We evaluate our approaches on ionospheric total electron content datasets as well as intermediate-redshift galaxies from the Sloan Digital Sky Survey. Hybrid-Dbscan outperforms the reference implementation across a range of application scenarios, including small workloads, which typically are the domain of CPU-only algorithms. We advance an empirical response time performance model of Hybrid-Dbscan by utilizing the underlying properties of the datasets. With only a single execution of Hybrid-Dbscan on a dataset, we are able to accurately predict the response time for a range of $\epsilon$ search distances. |
---|---|
Author | Gowanlock, Michael Li, Justin D. Rude, Cody M. Blair, David M. Pankratius, Victor |
Author_xml | – sequence: 1 givenname: Michael orcidid: 0000-0002-0826-6204 surname: Gowanlock fullname: Gowanlock, Michael email: michael.gowanlock@nau.edu organization: School of Informatics, Computing & Cyber Systems, Northern Arizona University, Flagstaff, AZ, USA – sequence: 2 givenname: Cody M. orcidid: 0000-0002-9584-2600 surname: Rude fullname: Rude, Cody M. email: cmrude@mit.edu organization: Massachusetts Institute of Technology, Cambridge, MA, USA – sequence: 3 givenname: David M. surname: Blair fullname: Blair, David M. email: david_blair@brown.edu organization: Department of Chemistry and the Department of Earth, Environmental and Planetary Sciences, Brown University, Providence, RI, USA – sequence: 4 givenname: Justin D. orcidid: 0000-0003-3315-2038 surname: Li fullname: Li, Justin D. email: jdli@alumni.stanford.edu organization: AAAS Science & Technology Policy Fellow, Washington, DC, USA – sequence: 5 givenname: Victor orcidid: 0000-0002-4658-6583 surname: Pankratius fullname: Pankratius, Victor email: pankrat@mit.edu organization: Massachusetts Institute of Technology, Cambridge, MA, USA |
CODEN | ITDSEO |
CitedBy_id | crossref_primary_10_1016_j_jpdc_2022_06_005 crossref_primary_10_1016_j_knosys_2022_108501 crossref_primary_10_1016_j_ins_2021_08_036 crossref_primary_10_1109_TETC_2020_3048671 |
Cites_doi | 10.1145/2503210.2503262 10.1016/j.procs.2013.05.200 10.1145/602259.602266 10.1007/978-3-642-03722-1_3 10.1145/304182.304187 10.1109/TPDS.2014.2347041 10.1007/978-3-319-25087-8_25 10.1007/s11704-013-3158-3 10.1109/IPDPS.2017.17 10.1109/TPDS.2017.2675421 10.1145/1645953.1646038 10.1109/IPDPS.2009.5161068 10.1109/ICIME.2010.5477926 10.1145/1964179.1964184 10.1515/9781400874668 10.1088/0067-0049/219/1/12 10.1109/SISAP.2009.9 10.1145/2304576.2304621 10.1145/1376616.1376670 10.1109/MM.2017.37 10.1016/j.jco.2009.02.011 10.1109/SC.2014.51 10.1145/2390226.2390229 10.1109/SC.2012.9 10.1109/TPDS.2015.2500896 10.1109/IPDPS.2016.10 10.1109/IPDPS.2015.24 |
ContentType | Journal Article |
Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2019 |
DOI | 10.1109/TPDS.2018.2869777 |
Discipline | Engineering Computer Science |
EISSN | 1558-2183 |
EndPage | 777 |
ExternalDocumentID | 10_1109_TPDS_2018_2869777 8462789 |
Genre | orig-research |
GrantInformation_xml | – fundername: National Science Foundation grantid: ACI-1442997 funderid: 10.13039/100000001 |
ISSN | 1045-9219 |
IsDoiOpenAccess | false |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 4 |
ORCID | 0000-0002-9584-2600 0000-0002-0826-6204 0000-0003-3315-2038 0000-0002-4658-6583 |
OpenAccessLink | https://doi.org/10.1109/tpds.2018.2869777 |
PageCount | 12 |
PublicationDate | 2019-04-01 |
PublicationTitle | IEEE transactions on parallel and distributed systems |
PublicationTitleAbbrev | TPDS |
PublicationYear | 2019 |
Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
StartPage | 766 |
SubjectTerms | Algorithms; Clustering; Clustering algorithms; Datasets; DBSCAN; Galaxies; GPGPU; Graphics processing units; in-memory database; Indexing; Kernel; Optimization; parallel clustering; Performance degradation; query optimization; Red shift; Response time (computers); Sky surveys (astronomy); spatial databases; Throughput; Time factors |
URI | https://ieeexplore.ieee.org/document/8462789 https://www.proquest.com/docview/2191259648 |
Volume | 30 |
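
The abstract mentions grid-based indexing for the ε-neighborhood searches that DBSCAN relies on. As a rough illustration of that general idea only, the following is a minimal CPU-side Python sketch, not the paper's GPU kernels. It assumes a uniform grid whose cells have side length ε, so that all neighbors of a point lie in its own cell or an adjacent one. The names `build_grid`, `eps_neighbors`, and `brute_force_neighbors` are hypothetical and do not come from the record.

```python
from collections import defaultdict
from itertools import product
from math import floor


def build_grid(points, eps):
    """Map each grid cell (a tuple of ints) to the indices of the points inside it.

    Assumption for this sketch: cells are axis-aligned with side length eps.
    """
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        cell = tuple(floor(c / eps) for c in p)
        grid[cell].append(idx)
    return grid


def eps_neighbors(points, grid, eps, query_idx):
    """Return the sorted indices of points within eps of points[query_idx].

    The query point itself is included, as in the usual DBSCAN convention.
    """
    q = points[query_idx]
    home = tuple(floor(c / eps) for c in q)
    eps_sq = eps * eps
    found = []
    # Because cells have side eps, only the home cell and its adjacent
    # cells (3^d cells in d dimensions) can contain neighbors.
    for offset in product((-1, 0, 1), repeat=len(q)):
        cell = tuple(h + o for h, o in zip(home, offset))
        for idx in grid.get(cell, ()):
            dist_sq = sum((a - b) ** 2 for a, b in zip(points[idx], q))
            if dist_sq <= eps_sq:
                found.append(idx)
    return sorted(found)


def brute_force_neighbors(points, eps, query_idx):
    """Reference search over all points, used only to cross-check the grid."""
    q = points[query_idx]
    eps_sq = eps * eps
    return [
        idx
        for idx, p in enumerate(points)
        if sum((a - b) ** 2 for a, b in zip(p, q)) <= eps_sq
    ]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.5, 0.0), (0.9, 0.9), (3.0, 3.0), (-0.4, 0.3)]
    eps = 1.0
    grid = build_grid(pts, eps)
    for i in range(len(pts)):
        print(i, eps_neighbors(pts, grid, eps, i))
```

The grid restricts each query to a small, fixed set of cells rather than the whole dataset, which is the property that makes such indexes attractive for neighborhood search. How the published work lays out the index in GPU memory and batches results back to the host is not described in this record and is not modeled here.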