NetDS: Distributed Search Framework with Hybrid Acceleration Methods
Published in | Proceedings of the ... International Symposium on Parallel and Distributed Processing with Applications (Print) pp. 2203 - 2210 |
---|---|
Main Authors | , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 30.10.2024 |
Summary: | Approximate nearest neighbour search (ANNS) has achieved great success in indexing similar high-dimensional data in distributed search systems. As the scale of data vectors grows, distributed search requires large storage, low latency, and high throughput for processing vectors. To achieve this, researchers tend to balance the data load across more machines and implement efficient distributed frameworks, but this incurs a huge storage overhead, which leads to inefficient network transmission. To address this gap, we propose NetDS, which exploits the computational capacity of in-network computation and the storage capacity of solid-state drives. NetDS uses a multi-level constrained balanced tree to process data vectors and construct multi-level tables. NetDS then introduces a heuristic neighbour graph to solve the boundary data problem. NetDS also offloads central tables and graphs into switches to accelerate vector classification. Finally, NetDS designs hybrid storage and query pre-match methods to accelerate the distributed ANNS system. We deploy NetDS on a programmable switch and evaluate it. NetDS completes data and query processing in less time than other typical distributed frameworks. |
---|---|
ISSN: | 2158-9208 |
DOI: | 10.1109/ISPA63168.2024.00301 |