LZ77-like compression with fast random access

Kreft S.; Navarro G.

Keywords: sequence, compression, access, database, sequences, time, scheme, text, end, data, force, individual, parsing, Random, Indexing, Optimal

Abstract

We introduce an alternative Lempel-Ziv text parsing, LZ-End, that converges to the entropy and in practice gets very close to LZ77. LZ-End forces sources to finish at the end of a previous phrase. Most Lempel-Ziv parsings can decompress the text only from the beginning. LZ-End is the only parsing we know of able of decompressing arbitrary phrases in optimal time, while staying closely competitive with LZ77, especially on highly repetitive collections, where LZ77 excells. Thus LZ-End is ideal as a compression format for highly repetitive sequence databases, where access to individual sequences is required, and it also opens the door to compressed indexing schemes for such collections. © 2010 IEEE.

Más información

Título de la Revista: Data Compression Conference Proceedings
Editorial: Institute of Electrical and Electronics Engineers Inc.
Fecha de publicación: 2010
Página de inicio: 239
Página final: 248
URL: http://www.scopus.com/inward/record.url?eid=2-s2.0-77952730117&partnerID=q2rCbXpz