Performance of evaluation metrics for classification in imbalanced data
Abstract
This paper investigates the effectiveness of various metrics for selecting an adequate model for binary classification when data is imbalanced. Through an extensive simulation study involving 12 commonly used classification metrics, our findings indicate that the Matthews Correlation Coefficient, G-Mean, and Cohen's kappa consistently yield favorable performance. Conversely, the area under the curve and Accuracy metrics demonstrate poor performance across all studied scenarios, while the remaining seven metrics exhibit varying degrees of effectiveness in specific scenarios. Furthermore, we discuss a practical application in the financial area, which confirms the robust performance of these metrics in facilitating model selection among alternative link functions.
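As a rough illustration of why Accuracy can mislead on imbalanced data while the three favored metrics do not, the sketch below computes all four from a confusion matrix. It uses the standard textbook definitions (G-Mean taken as the square root of sensitivity times specificity), which are assumed here and may differ in detail from the paper's. The function names and the toy data set (5 positives, 95 negatives, a classifier that always predicts the majority class) are hypothetical and are not taken from the paper's simulation study.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    # Matthews Correlation Coefficient; defined as 0 when a marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def g_mean(tp, tn, fp, fn):
    # Geometric mean of sensitivity and specificity (assumed definition).
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

def cohen_kappa(tp, tn, fp, fn):
    # Observed agreement corrected for agreement expected by chance.
    n = tp + tn + fp + fn
    p_obs = (tp + tn) / n
    p_exp = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    return (p_obs - p_exp) / (1 - p_exp) if p_exp != 1 else 0.0

# Hypothetical imbalanced sample: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95
# Hypothetical trivial classifier that always predicts the majority class.
y_pred = [0] * 100

counts = confusion_counts(y_true, y_pred)
scores = {
    "accuracy": accuracy(*counts),
    "mcc": mcc(*counts),
    "g_mean": g_mean(*counts),
    "kappa": cohen_kappa(*counts),
}
print(scores)
```

In this toy case Accuracy is high even though no positive case is ever detected, whereas the other three metrics sit at zero. This is only a single constructed example, not a reproduction of the paper's results.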
More information
Title according to WOS: | ID WOS:001296991300001 Not found in local WOS DB |
Journal title: | COMPUTATIONAL STATISTICS |
Volume: | 40 |
Issue: | 3 |
Publisher: | SPRINGER HEIDELBERG |
Publication date: | 2025 |
First page: | 1447 |
Last page: | 1473 |
DOI: | 10.1007/s00180-024-01539-5 |
Notes: | ISI |