Iterated Straight-Line Programs

Urbina, Cristian

Abstract

We explore an extension to straight-line programs (SLPs) that outperforms, for some text families, the measure delta based on sub-string complexity, a lower bound for most measures and compressors exploiting repetitiveness (which are crucial in areas like Bioinformatics). The extension, called iterated SLPs (ISLPs), allows rules of the form A -> Pi(k2)(i=k1) B-1(ic1) center dot center dot center dot B-t(ict), for which we show how to extract any substring of length., from the represented text T[1.. n], in time O(lambda + log(2) n log log n). This is the first compressed representation for repetitive texts breaking d while, at the same time, supporting direct access to arbitrary text symbols in polylogarithmic time. As a byproduct, we extend Ganardi et al.'s technique to balance any SLP (so it has a derivation tree of logarithmic height) to a wide generalization of SLPs, including ISLPs.

Más información

Título según WOS: Iterated Straight-Line Programs
Título de la Revista: STRING PROCESSING AND INFORMATION RETRIEVAL, SPIRE 2020
Volumen: 14578
Editorial: SPRINGER INTERNATIONAL PUBLISHING AG
Fecha de publicación: 2024
Página de inicio: 66
Página final: 80
DOI:

10.1007/978-3-031-55598-5_5

Notas: ISI