Distributing a metric-space search index onto processors

Marin M.; Ferrarotti F.; Gil Costa V.

Keywords: cluster, objects, search, distributions, information, research, structures, space, database, topology, memory, behaviors, knowledge, index, world, set, data, similarity, theory, distributed, query, retrieval, web, processors, wide, approach, solution, Indexing (of information), Specific, User, engines, Behavioral, metric

Abstract

This paper studies the problem of distributing a metric-space search index based on compact clustering onto a set of distributed-memory processors. The aim is to enable efficient similarity search in large-scale Web search engines. The index data structure is composed of a set of clusters enclosing the database objects, and we propose distribution methods based on two different solution approaches. The first makes use of specific knowledge about the workload generated by user queries. Here the challenge is how to represent and use such knowledge in a method capable of producing a cluster distribution that leads to high performance. The second follows a novel direction by completely disregarding user behavior and looking instead at the relationships among the index clusters themselves to decide their placement onto processors. Both methods perform efficiently depending on the context, and they are generic enough to be applied to different distributed index data structures for metric-space databases. © 2010 IEEE.
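
To make the idea of workload-aware cluster placement concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes that each index cluster has an estimated load (for example, how often past queries visited it) and applies a standard greedy heuristic: clusters are taken from heaviest to lightest and each is placed on the currently least-loaded processor. The names `distribute_clusters`, `processor_loads`, and the sample load values are hypothetical and chosen only for illustration.

```python
import heapq


def distribute_clusters(cluster_loads, num_processors):
    """Greedy placement: heaviest cluster first, onto the least-loaded processor.

    cluster_loads maps a cluster id to an assumed load estimate
    (hypothetical, e.g. visit counts taken from a query log).
    Returns a dict mapping cluster id -> processor id.
    """
    # Heap entries are (accumulated load, processor id).
    heap = [(0.0, p) for p in range(num_processors)]
    heapq.heapify(heap)
    placement = {}
    # Sort by descending load; ties are broken by cluster id for determinism.
    ordered = sorted(cluster_loads.items(), key=lambda kv: (-kv[1], kv[0]))
    for cluster_id, load in ordered:
        total, proc = heapq.heappop(heap)
        placement[cluster_id] = proc
        heapq.heappush(heap, (total + load, proc))
    return placement


def processor_loads(cluster_loads, placement, num_processors):
    """Sum the assumed cluster loads assigned to each processor."""
    loads = [0.0] * num_processors
    for cluster_id, proc in placement.items():
        loads[proc] += cluster_loads[cluster_id]
    return loads


if __name__ == "__main__":
    # Made-up load estimates for five clusters.
    sample = {"c0": 50, "c1": 30, "c2": 20, "c3": 20, "c4": 10}
    placement = distribute_clusters(sample, 2)
    print(placement)
    print(processor_loads(sample, placement, 2))
```

A heuristic of this kind balances only the estimated load. It ignores which clusters tend to be visited by the same query, which is the sort of inter-cluster relationship the second approach in the abstract is concerned with.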

More information

Journal title: Proceedings of the International Conference on Parallel Processing
Publisher: Society of Laparoendoscopic Surgeons
Publication date: 2010
Start page: 433
End page: 442
URL: http://www.scopus.com/inward/record.url?eid=2-s2.0-78649583328&partnerID=q2rCbXpz