Approximate string matching over Ziv-Lempel compressed text
Abstract
We present a solution to the problem of performing approximate pattern matching on compressed text. The format we choose is the Ziv-Lempel family, specifically the LZ78 and LZW variants. Given a text of length u compressed into length n, and a pattern of length m, we report all the R occurrences of the pattern in the text allowing up to k insertions, deletions and substitutions, in O(mkn + R) time. The existence problem needs O(mkn) time. We also show that the algorithm can be adapted to run in O(k(2)n + min(mkn, m(2)(m sigma)(k)) + R) average time, where sigma is the alphabet size. The experimental results show a speedup over the basic approach for moderate m and small Ic.
Más información
| Título según WOS: | Approximate string matching over Ziv-Lempel compressed text |
| Título de la Revista: | GAMES AND LEARNING ALLIANCE, GALA 2024 |
| Volumen: | 1848 |
| Editorial: | SPRINGER INTERNATIONAL PUBLISHING AG |
| Fecha de publicación: | 2000 |
| Página de inicio: | 195 |
| Página final: | 209 |
| Idioma: | English |
| Notas: | ISI |