Latent semantic analysis and keyword extraction for phishing classification

L'Huillier, G.; Hevia, A.; Weber, R.; Ríos, S.

Keywords: information, learning, safety, classification, sets, extraction, planning, algorithms, characterisation, science, text, data, semantics, social, mining, analysis, machine, latent, electronic, processing, engineering, Strategic, feature, mail, Semantic, Phishing, Keyword

Abstract

Phishing email fraud has been considered one of the main cyber-threats in recent years. Its development has been closely related to social engineering techniques, in which different fraud strategies are used to deceive a naïve email user. In this work, a latent semantic analysis and text mining methodology is proposed for characterising such strategies and then classifying emails using supervised learning algorithms. The results show that the feature set extracted in this work is competitive with previous phishing feature extraction methodologies, achieving promising results across different benchmark machine learning classification techniques. © 2010 IEEE.

More information

Journal title: 1604-2004: SUPERNOVAE AS COSMOLOGICAL LIGHTHOUSES
Publisher: ASTRONOMICAL SOC PACIFIC
Publication date: 2010
Start page: 129
End page: 131
URL: http://www.scopus.com/inward/record.url?eid=2-s2.0-77954766719&partnerID=q2rCbXpz