An unsupervised approach for traffic trace sanitization based on the entropy spaces

Velarde-Alvarado, P.; Vargas-Rosales, C.; Martínez-Peláez, R.; Toral-Cruz, H.; Martínez-Herrera, A.F.

Keywords: principal component analysis, data mining, k-means clustering, sanitizing traffic data, entropy-based anomaly detection

Abstract

The accuracy and reliability of an anomaly-based network intrusion detection system depend on the quality of the data used to build a normal behavior profile. However, obtaining these datasets is not trivial due to privacy, obsolescence, and suitability issues. This paper presents an approach to traffic trace sanitization based on the identification of anomalous patterns in a three-dimensional entropy space of the traffic flow data captured from a campus network. Anomaly-free datasets are generated by filtering out attacks and traffic segments that shift the typical position of centroids in the entropy space. Our analyses were performed on real-life traffic traces and show that the sanitized datasets are homogeneous and consistent in terms of cluster centroids and probability distributions of the PCA-transformed entropy space.
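As a rough illustration of what a point in a three-dimensional entropy space might look like, the sketch below maps one time window of flow records to a triple of Shannon entropies. The abstract does not say which three flow features are used, so the field names `src_ip`, `dst_ip`, and `dst_port`, the helper names, and the sample window are all hypothetical; this is not the authors' implementation.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = sum over symbols of p * log2(1 / p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_point(flows, features=("src_ip", "dst_ip", "dst_port")):
    """Map one window of flow records to a point in a 3-D entropy space.

    The three feature names are assumptions made for this example only.
    """
    return tuple(shannon_entropy([flow[name] for flow in flows]) for name in features)


# Hypothetical window of four flows, all aimed at a single destination host.
window = [
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.1.1", "dst_port": 80},
    {"src_ip": "10.0.0.2", "dst_ip": "10.0.1.1", "dst_port": 80},
    {"src_ip": "10.0.0.3", "dst_ip": "10.0.1.1", "dst_port": 443},
    {"src_ip": "10.0.0.4", "dst_ip": "10.0.1.1", "dst_port": 443},
]
point = entropy_point(window)  # one coordinate per feature
```

Under these assumptions, each window yields one point. A sequence of such points could then be clustered (for example with k-means) so that windows which shift the centroids become candidates for removal, in the spirit of the approach the abstract describes.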

More information

Journal title: TELECOMMUNICATION SYSTEMS
Volume: 61
Publisher: SPRINGER-VERLAG BERLIN
Publication date: 2015
Start page: 609
End page: 626
Language: English
URL: https://link.springer.com/article/10.1007/s11235-015-0017-6
Notes: WOS Core Collection