Boyer-Moore string matching over Ziv-Lempel compressed text
Abstract
We present a Boyer-Moore approach to string matching over LZ78 and LZW compressed text. The key idea is that, despite that we cannot exactly choose which text characters to inspect, we can still use the characters explicitly represented in those formats to shift the pattern in the text. We present a basic approach and more advanced ones. Despite that the theoretical average complexity does not improve because still all the symbols in the compressed text have to be scanned, we show experimentally that speedups of up to 30% over the fastest previous approaches are obtained. Moreover, we show that using an encoding method that sacrifices some compression ratio our method is twice as fast as decompressing plus searching using the best available algorithms.
Más información
Título según WOS: | Boyer-Moore string matching over Ziv-Lempel compressed text |
Título de la Revista: | STRING PROCESSING AND INFORMATION RETRIEVAL, SPIRE 2020 |
Volumen: | 1848 |
Editorial: | SPRINGER INTERNATIONAL PUBLISHING AG |
Fecha de publicación: | 2000 |
Página de inicio: | 166 |
Página final: | 180 |
Idioma: | English |
Notas: | ISI |