A scalable graph generation algorithm to sample over a given shell distribution

Graphs are commonly used to model the relationships between various entities. These graphs can be enormously large and thus, scalable graph analysis has been the subject of many research efforts. To enable scalable analytics, many researchers have focused on generating realistic graphs that support...

Bibliographic Details
Published in 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 227-236
Main Authors Ozkaya, M. Yusuf; Balin, M. Fatih; Pinar, Ali; Catalyurek, Umit V.
Format Conference Proceeding
Language English
Published IEEE 01.05.2020
Abstract Graphs are commonly used to model the relationships between various entities. These graphs can be enormously large, and thus scalable graph analysis has been the subject of many research efforts. To enable scalable analytics, many researchers have focused on generating realistic graphs that support controlled experiments for understanding how algorithms perform under changing graph features. Significant progress has been made on scalable graph generation that preserves some important graph properties (e.g., degree distribution, clustering coefficients). In this paper, we study how to sample a graph from the space of graphs with a given shell distribution. Shell distribution is related to the k-core, which is the largest subgraph where each vertex is connected to at least k other vertices. A k-shell is the subset of vertices that are in the k-core but not in the (k+1)-core, and the shell distribution comprises the sizes of these shells. Core decompositions are widely used to extract information from graphs and to assist other computations. We present a scalable shared- and distributed-memory graph generator that, given a shell decomposition, generates a random graph that conforms to it. Our extensive experimental results show the efficiency and scalability of our methods. Our algorithm generates 2^33 vertices and 2^37 edges in less than 50 seconds on 384 cores. This work is funded by the Laboratory Directed Research and Development program of Sandia National Laboratories. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525.
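The k-core and k-shell definitions in the abstract can be illustrated with a small sketch. The Python below is not the paper's generator; it only computes the shell distribution of an existing graph by repeatedly peeling a minimum-degree vertex. The function names `core_numbers` and `shell_distribution` and the adjacency-dict input format are assumptions made for this illustration.

```python
from collections import Counter


def core_numbers(adj):
    """Return {vertex: core number} for an undirected graph.

    `adj` maps each vertex to the set of its neighbours (hypothetical
    input format chosen for this sketch).
    """
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel a vertex of minimum remaining degree.
        v = min(remaining, key=degree.__getitem__)
        # The core number never decreases along the peeling order.
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def shell_distribution(adj):
    """Return {k: number of vertices in the k-shell}, sorted by k."""
    return dict(sorted(Counter(core_numbers(adj).values()).items()))


if __name__ == "__main__":
    # A triangle (0, 1, 2) with one pendant vertex 3 attached to vertex 2.
    example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    print(shell_distribution(example))  # {1: 1, 2: 3}
```

In the example, the pendant vertex forms the 1-shell and the triangle forms the 2-shell. The linear scan for the minimum makes this sketch quadratic in the number of vertices; bucket-based peeling runs in time linear in the number of edges, which would matter at the graph sizes the abstract reports.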
Author_xml – sequence: 1
  givenname: M. Yusuf
  surname: Ozkaya
  fullname: Ozkaya, M. Yusuf
  organization: Georgia Institute of Technology, School of Computational Science and Engineering, Atlanta, GA, USA
– sequence: 2
  givenname: M. Fatih
  surname: Balin
  fullname: Balin, M. Fatih
  organization: Georgia Institute of Technology, School of Computational Science and Engineering, Atlanta, GA, USA
– sequence: 3
  givenname: Ali
  surname: Pinar
  fullname: Pinar, Ali
  organization: Sandia National Laboratories, Livermore, CA, USA
– sequence: 4
  givenname: Umit V.
  surname: Catalyurek
  fullname: Catalyurek, Umit V.
  organization: Sandia National Laboratories, Livermore, CA, USA
DOI 10.1109/IPDPSW50202.2020.00051
EISBN 1728174457
9781728174457
OpenAccessLink https://www.osti.gov/biblio/1808780
SubjectTerms distributed algorithms
Generators
graph generation
Histograms
Indexes
Laboratories
Parallel algorithms
Partitioning algorithms
scalable graph algorithms
shared memory
Software algorithms
URI https://ieeexplore.ieee.org/document/9150480