Increased bit-parallelism for approximate string matching

Hyyro, H.; Fredriksson, K.; Navarro, G.

Abstract

Bit-parallelism permits executing several operations simultaneously over a set of bits or numbers stored in a single computer word. This technique permits searching for the approximate occurrences of a pattern of length m in a text of length n in time O(⌈m/w⌉ n), where w is the number of bits in the computer word. Although this is asymptotically the optimal speedup over the basic O(mn) time algorithm, it wastes bit-parallelism's power in the common case where m is much smaller than w, since w − m bits in the computer word go unused. In this paper we explore different ways to increase the bit-parallelism when the search pattern is short. First, we show how multiple patterns can be packed in a single computer word so as to search for multiple patterns simultaneously. Instead of paying O(rn) time to search for r patterns of length m < w, we obtain O(⌈r/⌊w/m⌋⌉ n) time. Second, we show how the mechanism permits boosting the search for a single pattern of length m < w, which can be searched for in time O(n/⌊w/m⌋) instead of O(n). Finally, we show how to extend these algorithms so that the time bounds essentially depend on k instead of m, where k is the maximum number of differences permitted. Our experimental results show that the algorithms work well in practice, and are the fastest alternatives for a wide range of search parameters. © Springer-Verlag 2004.
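To make the O(n) single-word baseline concrete, here is a minimal sketch of one well-known bit-parallel approximate search, Myers' bit-vector algorithm in the formulation usually attributed to Hyyrö. The abstract does not name its baseline, so treating this as that baseline is an assumption. The function name `myers_search` and its parameters are hypothetical. Python integers are unbounded, so the machine word is simulated by masking every value to m bits.

```python
def myers_search(pattern, text, k):
    """Return the end positions j in text where some substring ending at j
    matches pattern with at most k differences (edit distance).
    Illustrative sketch; assumes 1 <= len(pattern) <= w on a real machine."""
    m = len(pattern)
    mask = (1 << m) - 1      # keeps only the m bits that are in use
    high = 1 << (m - 1)      # bit for the last pattern position

    # peq[c] has bit i set when pattern[i] == c
    peq = {}
    for i, ch in enumerate(pattern):
        peq[ch] = peq.get(ch, 0) | (1 << i)

    pv, mv = mask, 0         # vertical deltas: +1 everywhere, -1 nowhere
    score = m                # distance value in the last row of the column
    ends = []

    for j, ch in enumerate(text):
        eq = peq.get(ch, 0)
        xv = eq | mv
        xh = ((((eq & pv) + pv) ^ pv) | eq) & mask
        ph = (mv | ~(xh | pv)) & mask   # horizontal delta +1
        mh = pv & xh                    # horizontal delta -1

        # update the last-row score from the horizontal delta at row m
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1

        # shift in 0: row 0 stays at distance 0, so a match may start anywhere
        ph = (ph << 1) & mask
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv

        if score <= k:
            ends.append(j)
    return ends
```

For example, `myers_search("abc", "xxabcxx", 0)` returns `[4]`, the index where the exact occurrence ends. Each text character costs a constant number of word operations, but only m of the w bits carry information, which is the waste the abstract sets out to recover.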

More information

Title according to WOS: Increased bit-parallelism for approximate string matching
Title according to SCOPUS: Increased bit-parallelism for approximate string matching
Journal title: STRING PROCESSING AND INFORMATION RETRIEVAL, SPIRE 2020
Volume: 3059
Publisher: SPRINGER INTERNATIONAL PUBLISHING AG
Publication date: 2004
Start page: 285
End page: 298
Language: English
Notes: ISI, SCOPUS