Iterated Straight-Line Programs

Urbina, Cristian

Abstract

We explore an extension to straight-line programs (SLPs) that outperforms, for some text families, the measure δ based on substring complexity, a lower bound for most measures and compressors exploiting repetitiveness (which are crucial in areas like Bioinformatics). The extension, called iterated SLPs (ISLPs), allows rules of the form A→Πi=k1k2B1ic1⋯Btict, for which we show how to extract any substring of length λ, from the represented text T[1..n], in time O(λ+log2nloglogn). This is the first compressed representation for repetitive texts breaking δ while, at the same time, supporting direct access to arbitrary text symbols in polylogarithmic time. As a byproduct, we extend Ganardi et al.’s technique to balance any SLP (so it has a derivation tree of logarithmic height) to a wide generalization of SLPs, including ISLPs.

Más información

Título según WOS: Iterated Straight-Line Programs
Título según SCOPUS: Iterated Straight-Line Programs
Título de la Revista: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volumen: 14578
Editorial: Springer Science and Business Media Deutschland GmbH
Fecha de publicación: 2024
Página de inicio: 66
Página final: 80
Idioma: English
DOI:

10.1007/978-3-031-55598-5_5

Notas: ISI, SCOPUS