Fast kNN query processing over a multi-node GPU environment
Abstract
The kNN (k nearest-neighbors) search is currently applied in a wide range of applications, such as data mining, multimedia, information retrieval, machine learning, pattern recognition, among others. Most of the solutions for this type of search are restricted to metric spaces or limited to use low dimension data. Our proposed algorithm uses as input a set of values (or measures) and returns the K lowest values from that set and can be used with measures obtained from metric and non-metric spaces or also from high dimensional databases. In this work, we introduce a novel GPU-based exhaustive algorithm to solve kNN queries, which is composed of two steps. The first is based on pivots to reduce the range of search, and the second one uses a set of heaps as auxiliary structures to return the final results. We also extended our algorithm to be able to use a multi-GPU platform and a multi-node/multi-GPU platform. To the best of our knowledge, taking account of the state-of-the-art technical literature, this work uses the most extensive database (in terms of data amount) to process a kNN query using up to 13,189 million of elements and achieving a speed-up up to 1843x when using a 5-nodes/20-GPUs platform.
Más información
Título según WOS: | Fast kNN query processing over a multi-node GPU environment |
Título de la Revista: | JOURNAL OF SUPERCOMPUTING |
Volumen: | 78 |
Número: | 2 |
Editorial: | Springer |
Fecha de publicación: | 2022 |
Página de inicio: | 3045 |
Página final: | 3071 |
DOI: |
10.1007/S11227-021-03975-2 |
Notas: | ISI |