PDI - Search Result

Abstract

Two decades ago, a breakthrough in indexing string collections made it possible to represent them within their compressed space while at the same time offering indexed search functionalities. As this new technology permeated through applications like bioinformatics, the string collections experienced a growth that outperforms Moore's Law and challenges our ability to handle them even in compressed form. It turns out, fortunately, that many of these rapidly growing string collections are highly repetitive, so that their information content is orders of magnitude lower than their plain size. The statistical compression methods used for classical collections, however, are blind to this repetitiveness, and therefore a new set of techniques has been developed to properly exploit it. The resulting indexes form a new generation of data structures able to handle the huge repetitive string collections that we are facing. In this survey, formed by two parts, we cover the algorithmic developments that have led to these data structures.

More information

Title according to WOS:	Indexing Highly Repetitive String Collections, Part I: Repetitiveness Measures
Journal title:	ACM COMPUTING SURVEYS
Volume:	54
Issue:	2
Publisher:	ASSOC COMPUTING MACHINERY
Publication date:	2021
DOI:	10.1145/3434399
Notes:	ISI
