Hybrid hash join process

A hybrid hash join process joins data rows from two tables which have at least one common data column by partitioning the data rows based on the values in the common data column(s), creating data structures to decrease search time for matching rows, and recovering full data buffers using a unique bu...

Full description

Saved in:
Bibliographic Details
Main Authors PLASEK JAMES M, BAERENWALD LOWELL L, LIU LEEIN HSU, JENDE MICHAEL S, YONEDA KENJI
Format Patent
LanguageEnglish
Published 17.07.2001
Edition7
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:A hybrid hash join process joins data rows from two tables which have at least one common data column by partitioning the data rows based on the values in the common data column(s), creating data structures to decrease search time for matching rows, and recovering full data buffers using a unique buffer management methodology. A smaller one of the two tables is designated as an outer table and a larger one of the two tables as an inner table. The hybrid hash join process determines which rows in the inner and outer tables satisfy a selection criteria; the rows that satisfy the selection criteria are referred to as inner hit rows and outer hit rows. The hybrid hash join process assigns the inner and outer hit rows to corresponding inner and outer partitions, respectively. Buffer overflow in the outer partitions is handled by linking empty buffers to the outer partitions until all buffers are used. Buffer space is recovered by writing outer hit rows for a selected partition to mass storage. Buffer overflow in the inner partitions is handled by either writing the inner rows in the buffer to mass storage or by searching the corresponding outer partition for matches. The outer hit rows in the corresponding outer partition are allocated to entries in a data structure which is then probed to find matches for the inner hit rows in the buffer. Matching outer hit rows for any inner hit rows written to mass storage are found by repeatedly reading into memory inner and outer hit rows and probing a data structure created from the memory-resident outer hit rows in a partition. Optionally, binary trees are built from the entries in the data structure to speed up the probing procedure. The hybrid hash join process uses at least one hashing algorithm to assign hit rows to partitions, to allocate rows to entries in the data structures, and to probe the data structures for matches.
Bibliography:Application Number: US19980135313