A signal denoising method for text meaning vectors

Hernandez S.; Sallis P.; Garden K.

Keywords: models, modeling, simulation, learning, stream, matrix, entropy, algorithms, media, text, signal, computer, data, noise, social, mining, parameter, analysis, elimination, machine, techniques, methods, method, denoising, processing, topic, mathematical, branch-and-bound, Initial, Very, Computational, Large, datum, Parameterized, Inter-dependencies, Sentiment

Abstract

The extraction of meaning, or at least of inter-dependencies, using data and text mining methods is well understood. Numerous approaches have been taken to select relevant information from often very large data sets. The discarding of items that are not relevant to a parameterized retrieval is usually based on an 'include or do not include' decision embedded in some kind of branch-and-bound algorithm, made more or less sophisticated by the use of machine learning techniques. This paper treats the discarding process as noise elimination within the context of well-established signal processing methods. It proposes an entropy-based approach using a value-weighted matrix for word relevance matching. The whole text is partitioned according to whether word pairs are directly relevant to the declared meaning being sought, which is expressed as a set of parameters, and the noise is treated as errors in the data stream. The resulting non-noisy data is represented as a text meaning vector, which stores the terms of direct relevance to the initial parameter values. © 2011 IEEE.
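
The abstract does not give the details of the algorithm, so the following is only a rough sketch of the general idea, not the authors' method. It assumes a hand-made weight table keyed by (word, parameter) pairs, a fixed relevance threshold, and Shannon entropy computed over the retained weights as a simple summary of how concentrated the meaning vector is. All names (`relevance`, `meaning_vector`, `threshold`) and the sample data are hypothetical.

```python
import math


def shannon_entropy(weights):
    """Shannon entropy in bits of non-negative weights, normalised to sum to 1."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)


def meaning_vector(text, parameters, relevance, threshold=0.5):
    """Split the words of `text` into relevant terms and noise.

    `relevance` maps (word, parameter) pairs to a weight; this table is an
    assumption of the sketch, standing in for the value-weighted matrix.
    A word is kept if its best weight against any parameter reaches
    `threshold`; every other word is treated as noise and discarded.
    """
    kept = {}
    noise = []
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'()")
        if not word:
            continue
        score = max((relevance.get((word, p), 0.0) for p in parameters), default=0.0)
        if score >= threshold:
            kept[word] = kept.get(word, 0.0) + score
        else:
            noise.append(word)
    return kept, noise


# Hypothetical example data: one parameter and a tiny weight table.
parameters = ["weather"]
relevance = {
    ("storm", "weather"): 0.9,
    ("rain", "weather"): 0.8,
    ("the", "weather"): 0.05,
}

vector, discarded = meaning_vector("The storm brought rain.", parameters, relevance)
entropy_bits = shannon_entropy(list(vector.values()))
```

Here `vector` plays the role of the text meaning vector and `discarded` holds the words treated as noise. A lower `entropy_bits` value would indicate that the retained weight is concentrated in fewer terms.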

More information

Journal title: 1604-2004: SUPERNOVAE AS COSMOLOGICAL LIGHTHOUSES
Publisher: ASTRONOMICAL SOC PACIFIC
Publication date: 2011
Start page: 23
End page: 25
URL: http://www.scopus.com/inward/record.url?eid=2-s2.0-80052341695&partnerID=q2rCbXpz