Load balancing distributed inverted files: Query ranking

Gomez-Pantoja, C.; Marin, M.

Keywords: systems, search, solutions, information, organization, structures, networks, time, index, algorithms, world, services, speed, file, queries, computer, preferences, data, scheduling, load, software, retrieval, web, telecommunication, wide, ups, Functions, User, Inverted, engines, Running, Boolean, balancing, files, (IF)

Abstract

Search engines use inverted files as index data structures to speed up the solution of user queries. The index is distributed over a set of processors forming a cluster of computers; queries are received by a broker machine and scheduled for solution in the cluster. The broker must use a scheduling algorithm to assign queries to processors, since the computations associated with ranking the documents that form part of the solutions to queries can take a significant fraction of the total running time. The cost of this task can be highly variable and depends on the particular user preferences for words when formulating queries in a given period of time. Thus the scheduling algorithm must be able to cope efficiently with a very large and highly dynamic stream of jobs being assigned on-line to the processors. In this paper we evaluate a number of scheduling algorithms proposed in the literature in the context of scheduling queries on a search engine. © 2008 IEEE.
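
To make the broker's on-line assignment problem concrete, the following is a minimal sketch of one classical rule, greedy least-loaded assignment. It is not taken from the paper, which does not name the algorithms it evaluates in this abstract. The function name `schedule_least_loaded`, the per-query ranking cost estimates, and the processor numbering are all assumptions made for illustration.

```python
import heapq


def schedule_least_loaded(query_costs, num_processors):
    """Assign each query, in arrival order, to the currently least-loaded processor.

    query_costs: estimated ranking cost of each query (hypothetical values;
                 a real broker would have to estimate these itself).
    Returns (assignment, loads): the processor chosen for each query and the
    accumulated load per processor.
    """
    # Min-heap of (accumulated_load, processor_id); ties go to the lower id.
    heap = [(0.0, p) for p in range(num_processors)]
    heapq.heapify(heap)

    assignment = []
    for cost in query_costs:
        load, proc = heapq.heappop(heap)   # least-loaded processor right now
        assignment.append(proc)
        heapq.heappush(heap, (load + cost, proc))

    # Collect the final load of every processor.
    loads = [0.0] * num_processors
    for load, proc in heap:
        loads[proc] = load
    return assignment, loads


if __name__ == "__main__":
    assignment, loads = schedule_least_loaded([5, 1, 1, 1, 4], 2)
    print(assignment, loads)
```

Because the rule decides on each query as it arrives, without knowledge of future queries, a single expensive query can still leave the processors unevenly loaded; this is the kind of behavior a comparison of scheduling algorithms would need to measure.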

More information

Journal title: 1604-2004: SUPERNOVAE AS COSMOLOGICAL LIGHTHOUSES
Publisher: ASTRONOMICAL SOC PACIFIC
Publication date: 2008
Start page: 329
End page: 333
URL: http://www.scopus.com/inward/record.url?eid=2-s2.0-47349083068&partnerID=q2rCbXpz