Grammar-compressed indexes with logarithmic search time
Abstract
Let a text T[1..n] be the only string generated by a context-free grammar with g (terminal and nonterminal) symbols, and of size G (measured as the sum of the lengths of the righthand sides of the rules). Such a grammar, called a grammar-compressed representation of T, can be encoded using G lg G bits. We introduce the first grammar-compressed index that uses O(G lg n) bits (precisely, G lg n + (2 + E)G lg g for any constant epsilon > 0) and can find the occ occurrences of patterns P[1..m] in time O ((m(2) + occ) lg G). We implement the index and demonstrate its practicality in comparison with the state of the art, on highly repetitive text collections. (c) 2020 Elsevier Inc. All rights reserved.
Más información
Título según WOS: | Grammar-compressed indexes with logarithmic search time |
Título de la Revista: | JOURNAL OF COMPUTER AND SYSTEM SCIENCES |
Volumen: | 118 |
Editorial: | ACADEMIC PRESS INC ELSEVIER SCIENCE |
Fecha de publicación: | 2021 |
Página de inicio: | 53 |
Página final: | 74 |
DOI: |
10.1016/j.jcss.2020.12.001 |
Notas: | ISI |