Grammar-compressed indexes with logarithmic search time

Claude, Francisco; Pacheco, Alejandro

Abstract

Let a text T[1.n] be the only string generated by a context-free grammar with g (terminal and nonterminal) symbols, and of size G (measured as the sum of the lengths of the right-hand sides of the rules). Such a grammar, called a grammar-compressed representation of T, can be encoded using Glg⁡G bits. We introduce the first grammar-compressed index that uses O(Glg⁡n) bits (precisely, Glg⁡n+(2+ϵ)Glg⁡g for any constant ϵ>0) and can find the occ occurrences of patterns P[1.m] in time O((m2+occ)lg⁡G). We implement the index and demonstrate its practicality in comparison with the state of the art, on highly repetitive text collections.

Más información

Título según WOS: Grammar-compressed indexes with logarithmic search time
Título según SCOPUS: Grammar-compressed indexes with logarithmic search time
Título de la Revista: Journal of Computer and System Sciences
Volumen: 118
Editorial: ACADEMIC PRESS INC
Fecha de publicación: 2021
Página final: 74
Idioma: English
DOI:

10.1016/j.jcss.2020.12.001

Notas: ISI, SCOPUS