Relative Lempel-Ziv with Constant-Time Random Access
Abstract
Relative Lempel-Ziv (RLZ) is a variant of LZ77 that can compress well collections of similar genomes while still allowing fast random access to them. In theory, at the cost of using sublinear extra space, accessing an arbitrary character takes constant time. We show that even in practice this works quite well: e.g., we can compress 36 S. cerevisiae genomes from a total of 464 MB to 11 MB and still support random access to them in under 50 nanoseconds per character, even when the accessed substrings are short. Our theoretical contribution is an optimized representation of RLZ's pointers.
Más información
Título según WOS: | Relative Lempel-Ziv with Constant-Time Random Access |
Título de la Revista: | BIO-INSPIRED SYSTEMS AND APPLICATIONS: FROM ROBOTICS TO AMBIENT INTELLIGENCE, PT II |
Volumen: | 8799 |
Editorial: | SPRINGER INTERNATIONAL PUBLISHING AG |
Fecha de publicación: | 2014 |
Página de inicio: | 13 |
Página final: | 17 |
Notas: | ISI |