The Heuristic Algorithm For Symmetric Horizontal Data Distribution

Bibliographic Details
Published in IEEE NW Russia Young Researchers in Electrical and Electronic Engineering Conference, pp. 2161-2165
Main Authors Munerman, Victor; Munerman, Daniel; Samoilova, Tatyana
Format Conference Proceeding
Language English
Published IEEE 26.01.2021
ISSN 2376-6565
DOI 10.1109/ElConRus51938.2021.9396510

Abstract The article considers an algorithm for the optimal distribution of "objects" of an arbitrary nature among "storages", the essence of which is determined by the subject area. Some subject areas for which the optimal distribution problem is relevant are considered. The authors consider the problem of accelerating the Join operation. In big data parallel processing, the Join operation requires a uniform distribution of data among the cluster processors: a parallel implementation of Join is effective only when the computational complexities of its execution on all database fragments differ minimally from each other. The optimality criterion should therefore ensure uniform distribution of data. A detailed description of the heuristic optimal distribution algorithm is given. Objective functions for the problems under consideration are proposed. The experiments used to assess the quality of the heuristic greedy optimal distribution algorithm are described. These experiments yielded the dependence of the algorithm's execution time on the number of distributed objects, and the dependence of the distribution quality (the difference between the maximum and minimum storage load) on the number of storages and the range of object weights. It is shown that the algorithm is quite simple and can be easily implemented in any programming language. The running time of the algorithm, even for big data, is small, which allows it to be used effectively when preparing data for the parallel solution of problems with high computational complexity. The algorithm shows good results when distributing table operands across data warehouses: the largest storage load differs from the smallest by a small amount.
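The abstract does not spell out the steps of the authors' algorithm, so the following is only a minimal sketch of one well-known greedy heuristic for this kind of balancing problem: sort objects by weight in descending order and always place the next object into the currently lightest storage. The function names `greedy_distribute` and `imbalance`, and the choice of this particular heuristic, are assumptions made for illustration and are not taken from the paper. The quality measure mirrors the one named in the abstract, the difference between the largest and smallest storage load.

```python
import heapq
from typing import List, Sequence, Tuple


def greedy_distribute(
    weights: Sequence[float], n_storages: int
) -> Tuple[List[List[int]], List[float]]:
    """Hypothetical helper: assign object indices to storages.

    Objects are taken heaviest first and each one goes to the storage
    with the smallest current load (an assumed heuristic, not
    necessarily the one described in the paper).
    """
    if n_storages <= 0:
        raise ValueError("n_storages must be positive")

    # Min-heap of (current load, storage index) so the lightest
    # storage is always available in O(log n_storages).
    heap = [(0.0, s) for s in range(n_storages)]
    heapq.heapify(heap)

    assignment: List[List[int]] = [[] for _ in range(n_storages)]
    loads = [0.0] * n_storages

    # Visit objects in order of decreasing weight.
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    for i in order:
        load, s = heapq.heappop(heap)
        assignment[s].append(i)
        loads[s] = load + weights[i]
        heapq.heappush(heap, (loads[s], s))

    return assignment, loads


def imbalance(loads: Sequence[float]) -> float:
    """Quality measure: largest storage load minus smallest storage load."""
    return max(loads) - min(loads)


if __name__ == "__main__":
    assignment, loads = greedy_distribute([7, 5, 4, 3, 1], n_storages=2)
    print(assignment, loads, imbalance(loads))
```

In a Join setting, the weight of an object could stand for the estimated cost of processing one group of rows, and each storage for one cluster node; that mapping is likewise an illustrative assumption.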
Author_xml – sequence: 1
  givenname: Victor
  surname: Munerman
  fullname: Munerman, Victor
  email: vimoon@gmail.com
  organization: Smolensk State University (SmolGU),Physics & Mathematics Dept.,Smolensk,Russia
– sequence: 2
  givenname: Daniel
  surname: Munerman
  fullname: Munerman, Daniel
  email: danvmoon@gmail.com
  organization: Smolensk State University (SmolGU),Physics & Mathematics Dept.,Smolensk,Russia
– sequence: 3
  givenname: Tatyana
  surname: Samoilova
  fullname: Samoilova, Tatyana
  email: tatsamoilova24@gmail.com
  organization: Smolensk State University (SmolGU),Physics & Mathematics Dept.,Smolensk,Russia
Discipline Engineering
EISBN 1665404760
9781665404761
EISSN 2376-6565
SubjectTerms Big Data
Computational complexity
Computer languages
Data warehouses
heuristic algorithm
Heuristic algorithms
Linear programming
optimal distribution
parallel programming
Program processors
URI https://ieeexplore.ieee.org/document/9396510