NetDS: Distributed Search Framework with Hybrid Acceleration Methods


Bibliographic Details
Published in Proceedings of the ... International Symposium on Parallel and Distributed Processing with Applications (Print) pp. 2203-2210
Main Authors Zhang, Penghao; Hu, Zhiguo
Format Conference Proceeding
Language English
Published IEEE 30.10.2024
Summary: Approximate nearest neighbour search (ANNS) has achieved great success in indexing similar high-dimensional data in distributed search systems. As the scale of data vectors grows, distributed search requires large storage, low latency, and high throughput for processing vectors. To achieve this, researchers tend to balance the data load across more machines and implement efficient distributed frameworks, but they pay a huge storage overhead, which leads to inefficient network transmission. To address this gap, we propose NetDS, which exploits the computational capacity of in-network computation and the storage capacity of solid-state drives. NetDS utilizes a multi-level constrained balanced tree to process data vectors and construct multi-level tables. Then, NetDS proposes a heuristic neighbour graph to solve the boundary data problem. NetDS also offloads central tables and graphs into switches to accelerate vector classification. Finally, NetDS designs hybrid storage and query pre-match methods to accelerate the distributed ANNS system. We deploy NetDS on a programmable switch and evaluate it. NetDS completes data and query processing in a shorter time than other typical distributed frameworks.
ISSN:2158-9208
DOI:10.1109/ISPA63168.2024.00301
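
The summary does not describe how the multi-level constrained balanced tree is built, so the following is only a generic illustration of the idea of a balanced partitioning tree whose levels act as routing tables. It assumes a median split on the highest-variance dimension and a hypothetical `leaf_capacity` limit per partition; the function names and the split rule are assumptions made for this sketch and are not taken from NetDS.

```python
from statistics import pvariance


def build_balanced_tree(vectors, ids, leaf_capacity):
    """Split ids by count at the median until each leaf fits leaf_capacity.

    Illustrative only: the split rule (highest-variance dimension, median
    threshold) is an assumption, not the construction used by NetDS.
    """
    if leaf_capacity < 1:
        raise ValueError("leaf_capacity must be at least 1")
    if len(ids) <= leaf_capacity:
        return {"leaf": sorted(ids)}
    # Choose the dimension along which the current vectors spread the most.
    dim = max(range(len(vectors[0])),
              key=lambda d: pvariance([vectors[i][d] for i in ids]))
    ordered = sorted(ids, key=lambda i: vectors[i][dim])
    mid = len(ordered) // 2  # halving by count keeps the two sides balanced
    return {
        "dim": dim,
        "threshold": vectors[ordered[mid]][dim],
        "left": build_balanced_tree(vectors, ordered[:mid], leaf_capacity),
        "right": build_balanced_tree(vectors, ordered[mid:], leaf_capacity),
    }


def route(tree, query):
    """Walk the per-level (dim, threshold) entries down to one leaf partition."""
    node = tree
    while "leaf" not in node:
        side = "left" if query[node["dim"]] < node["threshold"] else "right"
        node = node[side]
    return node["leaf"]


def leaves(tree):
    """Return the id lists of all leaf partitions, left to right."""
    if "leaf" in tree:
        return [tree["leaf"]]
    return leaves(tree["left"]) + leaves(tree["right"])
```

In this sketch each internal node plays the role of one table entry, and a query is classified by a fixed number of comparisons. Queries that fall close to a threshold may have true neighbours in an adjacent partition, which is the kind of boundary data problem the summary says the heuristic neighbour graph is meant to address.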