LZ77-like compression with fast random access
Keywords: sequence, compression, access, database, sequences, time, scheme, text, end, data, force, individual, parsing, Random, Indexing, Optimal
Abstract
We introduce an alternative Lempel-Ziv text parsing, LZ-End, that converges to the entropy and in practice gets very close to LZ77. LZ-End forces sources to finish at the end of a previous phrase. Most Lempel-Ziv parsings can decompress the text only from the beginning. LZ-End is the only parsing we know of able of decompressing arbitrary phrases in optimal time, while staying closely competitive with LZ77, especially on highly repetitive collections, where LZ77 excells. Thus LZ-End is ideal as a compression format for highly repetitive sequence databases, where access to individual sequences is required, and it also opens the door to compressed indexing schemes for such collections. © 2010 IEEE.
Más información
Título de la Revista: | Data Compression Conference Proceedings |
Editorial: | Institute of Electrical and Electronics Engineers Inc. |
Fecha de publicación: | 2010 |
Página de inicio: | 239 |
Página final: | 248 |
URL: | http://www.scopus.com/inward/record.url?eid=2-s2.0-77952730117&partnerID=q2rCbXpz |