An unsupervised approach for traffic trace sanitization based on the entropy spaces
Keywords: principal component analysis, data mining, k-means clustering, sanitizing traffic data, entropy-based anomaly detection
Abstract
The accuracy and reliability of an anomaly-based network intrusion detection system are dependent on the quality of data used to build a normal behavior profile. However, obtaining these datasets is not trivial due to privacy, obsolescence, and suitability issues. This paper presents an approach to traffic trace sanitization based on the identification of anomalous patterns in a three-dimensional entropy space of the flow traffic data captured from a campus network. Anomaly-free datasets are generated by filtering out attacks and traffic pieces that modify the typical position of centroids in the entropy space. Our analyses were performed on real life traffic traces and show that the sanitized datasets have homogeneity and consistency in terms of cluster centroids and probability distributions of the PCA-transformed entropy space.
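As a rough illustration of the idea in the abstract, the sketch below maps each time window of flow records to a point in a three-dimensional entropy space and then drops windows that lie far from the centroid. It is not the authors' method: the three flow fields (`src_ip`, `dst_ip`, `dst_port`), the helper names, and the fixed distance threshold `max_dist` are assumptions made for this example, and the simple distance filter stands in for the k-means and PCA analysis the paper describes.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def entropy_point(flows, fields=("src_ip", "dst_ip", "dst_port")):
    """Map one window of flow records to a point in the entropy space.

    The field names are hypothetical; the abstract does not say which
    three flow features span the entropy space.
    """
    return tuple(shannon_entropy([flow[name] for flow in flows]) for name in fields)


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def keep_typical(points, max_dist):
    """Keep the points within `max_dist` (Euclidean) of the overall centroid.

    Assumption: a single centroid and a fixed threshold, used here only
    to illustrate filtering out windows that shift the centroid.
    """
    center = centroid(points)
    return [p for p in points if math.dist(p, center) <= max_dist]
```

For example, a window of two flows that differ only in source address gives `entropy_point(...) == (1.0, 0.0, 0.0)`: one bit of entropy in the source dimension and none in the other two.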
More information
Journal title: | TELECOMMUNICATION SYSTEMS |
Volume: | 61 |
Publisher: | SPRINGER-VERLAG BERLIN |
Publication date: | 2015 |
Start page: | 609 |
End page: | 626 |
Language: | English |
URL: | https://link.springer.com/article/10.1007/s11235-015-0017-6 |
Notes: | WOS Core Collection |