Distributing a metric-space search index onto processors
Keywords: cluster, objects, search, distributions, information, research, structures, space, database, topology, memory, behaviors, knowledge, index, world, set, data, similarity, theory, distributed, query, retrieval, web, processors, wide, approach, solution, Indexing (of information), Specific, User, engines, Behavioral, metric
Abstract
This paper studies the problem of distributing a metric-space search index based on compact clustering onto a set of distributed-memory processors. The aim is to enable efficient similarity search in large-scale Web search engines. The index data structure is composed of a set of clusters enclosing the database objects, and we propose distribution methods based on two different solution approaches. The first makes use of specific knowledge about the workload generated by user queries. Here the challenge is how to represent such knowledge and use it in a method capable of producing a cluster distribution that leads to high performance. The second follows a novel direction: it completely disregards user behavior and looks instead at the relationships among the index clusters themselves to decide their placement onto processors. Both methods perform efficiently depending on the context, and they are generic enough to be applied to different distributed index data structures for metric-space databases. © 2010 IEEE.
More information
Journal title: | Proceedings of the International Conference on Parallel Processing |
Publisher: | Society of Laparoendoscopic Surgeons |
Publication date: | 2010 |
Start page: | 433 |
End page: | 442 |
URL: | http://www.scopus.com/inward/record.url?eid=2-s2.0-78649583328&partnerID=q2rCbXpz |