Grammar-compressed indexes with logarithmic search time
Abstract
Let a text T[1..n] be the only string generated by a context-free grammar with g (terminal and nonterminal) symbols, and of size G (measured as the sum of the lengths of the righthand sides of the rules). Such a grammar, called a grammar-compressed representation of T, can be encoded using G lg G bits. We introduce the first grammar-compressed index that uses O(G lg n) bits (precisely, G lg n + (2 + E)G lg g for any constant epsilon > 0) and can find the occ occurrences of patterns P[1..m] in time O ((m(2) + occ) lg G). We implement the index and demonstrate its practicality in comparison with the state of the art, on highly repetitive text collections. (c) 2020 Elsevier Inc. All rights reserved.
Más información
| Título según WOS: | Grammar-compressed indexes with logarithmic search time |
| Título de la Revista: | JOURNAL OF COMPUTER AND SYSTEM SCIENCES |
| Volumen: | 118 |
| Editorial: | ACADEMIC PRESS INC ELSEVIER SCIENCE |
| Fecha de publicación: | 2021 |
| Página de inicio: | 53 |
| Página final: | 74 |
| DOI: |
10.1016/j.jcss.2020.12.001 |
| Notas: | ISI |