Document retro-conversion for personalized electronic reedition

Belaid, Abdel; Alusse, Andre; Rangoni, Yves; Cecotti, Hubert; Farah, Fady; Gagean, Nicolas; Vigne, Henri

Keywords: document image analysis, neural network, OCR combination

Abstract

In this paper, we propose a generic framework to store, retrieve, transform and present mixed sets of native and virtual documents. We intend to use or develop specific tools organized in a global architecture, ranging from document analysis and capture, through document retrieval and classification-categorization, to full generation of personal sets of documents that correspond to the user's specific needs and profile. The first step concerns document preparation and formal analysis. The second step adds semantic metadata, content indexing, and structure-semantic analysis. The third step helps the user assemble personalized documents. The research is based on large domain-specific sets of documents, such as European Union law documents (many millions of documents, in many file formats and twenty official languages).

More information

Publication date: 2005
Start/End year: 8 March 2005
Start page: 193
End page: 218
Language: English
URL: https://www.kiv.zcu.cz/~dalfia/publications/IWDA2005.pdf