An efficient algorithm for approximated self-similarity joins in metric spaces
Abstract
Similarity join is a key operation in metric databases. It retrieves all pairs of elements that are similar. Solving such a problem usually requires comparing every pair of objects of the datasets, even when indexing and ad hoc algorithms are used. We propose a simple and efficient algorithm for the computation of the approximated k nearest neighbor self-similarity join. This algorithm computes Theta(n(3/2)) distances and it is empirically shown that it reaches an empirical precision of 46% in real-world datasets. We provide a comparison to other common techniques such as Quickjoin and Locality-Sensitive Hashing and argue that our proposal has a better execution time and average precision. (C) 2020 Elsevier Ltd. All rights reserved.
Más información
| Título según WOS: | An efficient algorithm for approximated self-similarity joins in metric spaces |
| Título según SCOPUS: | An efficient algorithm for approximated self-similarity joins in metric spaces |
| Título de la Revista: | INFORMATION SYSTEMS |
| Volumen: | 91 |
| Editorial: | PERGAMON-ELSEVIER SCIENCE LTD |
| Fecha de publicación: | 2020 |
| Idioma: | English |
| DOI: |
10.1016/J.IS.2020.101510 |
| Notas: | ISI, SCOPUS |