Indexing text with approximate q-grams
Abstract
We present a new index for approximate string matching. The index collects text q-samples, i.e. disjoint text substrings of length q, at fixed intervals and stores their positions. At search time, part of the text is filtered out by noticing that any occurrence of the pattern must be reflected in the presence of some text q-samples that match approximately inside the pattern. We show experimentally that the parameterization mechanism of the related filtration scheme provides a compromise between the space requirement of the index and the error level for which the filtration is still efficient.
Más información
Título según WOS: | Indexing text with approximate q-grams |
Título de la Revista: | STRING PROCESSING AND INFORMATION RETRIEVAL, SPIRE 2020 |
Volumen: | 1848 |
Editorial: | SPRINGER INTERNATIONAL PUBLISHING AG |
Fecha de publicación: | 2000 |
Página de inicio: | 350 |
Página final: | 363 |
Idioma: | English |
Notas: | ISI |