Unsupervised training of Bayesian networks for data clustering
Abstract
This paper presents a new approach to the unsupervised training of Bayesian network classifiers. Three models are analysed: the Chow and Liu (CL) multinets; the tree-augmented naive Bayes; and a new model called the simple Bayesian network classifier, which is more robust in its structure learning. The models are trained without class labels by maximizing the classification maximum likelihood criterion, and the maximization is derived for each model within the classification expectation-maximization (EM) algorithm framework. The clustering performance of the proposed approach is measured on 10 well-known benchmark datasets. For comparison, results are also analysed for the k-means and EM algorithms, and for the three Bayesian network classifiers trained in a supervised way. A real-world image processing application is also presented, dealing with the clustering of wood board images described by 165 attributes. Results show that the proposed learning method, in general, outperforms traditional clustering algorithms. In the wood board image application, the CL multinets obtained, on average, a 12 per cent increase in clustering accuracy compared with the k-means method and a 7 per cent increase compared with the EM algorithm.
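As a rough illustration of the classification EM idea mentioned in the abstract, the sketch below alternates between fitting one model per cluster from the current hard partition and reassigning every sample to its most probable cluster. It is not the paper's method: each cluster is modelled with independent Gaussian attributes as a simple stand-in for the Bayesian network classifiers, and the function name `cem_cluster`, its parameters, and the variance floor are assumptions made for this example.

```python
import numpy as np

def cem_cluster(X, init_labels, n_iter=50):
    """Hard-assignment (classification) EM with independent Gaussian attributes.

    Assumption: this per-cluster model is a stand-in; the paper uses
    Bayesian network classifiers instead.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(init_labels).copy()
    n = X.shape[0]
    n_clusters = int(labels.max()) + 1
    for _ in range(n_iter):
        # Log of (prior * likelihood) for every sample and cluster.
        log_post = np.full((n, n_clusters), -np.inf)
        for k in range(n_clusters):
            members = X[labels == k]
            if len(members) == 0:
                continue  # an empty cluster can no longer attract samples
            # M-step: parameters estimated from the current partition only.
            prior = len(members) / n
            mean = members.mean(axis=0)
            var = members.var(axis=0) + 1e-6  # floor avoids division by zero
            # E-step: Gaussian log-likelihood summed over attributes.
            log_lik = -0.5 * (np.log(2 * np.pi * var)
                              + (X - mean) ** 2 / var).sum(axis=1)
            log_post[:, k] = np.log(prior) + log_lik
        # C-step: assign each sample to its most probable cluster.
        new_labels = log_post.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # the partition is stable
        labels = new_labels
    return labels
```

Calling `cem_cluster(X, init_labels)` with an `(n_samples, n_attributes)` array and an initial integer partition returns the refined cluster labels. Because the assignments are hard, the result depends on the initial partition, so in practice several initializations are usually tried.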
More information
Title according to WOS: | Unsupervised training of Bayesian networks for data clustering |
Title according to SCOPUS: | Unsupervised training of Bayesian networks for data clustering |
Journal title: | PROCEEDINGS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES |
Volume: | 465 |
Issue: | 2109 |
Publisher: | ROYAL SOC |
Publication date: | 2009 |
Start page: | 2927 |
End page: | 2948 |
Language: | English |
URL: | http://rspa.royalsocietypublishing.org/cgi/doi/10.1098/rspa.2009.0065 |
DOI: | 10.1098/rspa.2009.0065 |
Notes: | ISI, SCOPUS |