TWEEF: Trustworthiness Estimation and Enhancement Framework for Machine Learning Models

Ugalde, Jonathan; Salas, Rodrigo; Torres, Romina; Velandia, Daira; Bariviera, Aurelio F.; Estevez, Pablo A.; Godoy, Maria Paz

Abstract

The rapid adoption of Machine Learning (ML) in high-impact domains has intensified the need for systematic tools to assess and improve the trustworthiness of predictive models beyond conventional performance metrics. This paper presents TWEEF (Trustworthiness Estimation and Enhancement Framework), a modular and extensible framework that operationalizes trustworthiness through the joint evaluation of performance, fairness, and interpretability. TWEEF integrates intuitionistic fuzzy logic and subjective logic to transform quantitative trust-related metrics into linguistic assessments, which are subsequently aggregated using operators such as the Linguistic Weighted Average (LWA), Gaussian Weighted Aggregation (GWA), and Subjective Logic (SL). The framework extends the scikit-learn ecosystem through a meta-estimator, the TrustworthyClassifier, which orchestrates metric computation, bias-mitigation procedures, surrogate-model generation, and trust aggregation within a unified, pipeline-compatible workflow. The framework is empirically evaluated through four experiments on widely used benchmark datasets (German Credit, COMPAS, and Adult) in binary classification settings. Results show that TWEEF consistently reveals fairness and interpretability limitations that may remain hidden when relying solely on predictive performance, and that the resulting trust scores respond coherently to different metric configurations and weighting schemes. These findings indicate that TWEEF provides a structured mechanism for trust assessment and enhancement, while also offering a flexible foundation for future extensions to additional learning tasks and evaluation dimensions.
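The abstract describes aggregating per-dimension trust metrics (performance, fairness, interpretability) into a single trust assessment via weighted operators such as the Linguistic Weighted Average. The sketch below illustrates that general idea in plain Python; the metric names, weights, and linguistic thresholds are assumptions for illustration only, not the operators or parameters defined in the paper.

```python
# Illustrative sketch of weighted trust aggregation in the spirit of TWEEF's
# Linguistic Weighted Average (LWA). All names, weights, and thresholds here
# are hypothetical; the paper's actual fuzzy/subjective-logic operators differ.

def weighted_trust_score(metrics, weights):
    """Aggregate per-dimension trust metrics (each in [0, 1]) into one score."""
    total_w = sum(weights.values())
    return sum(metrics[k] * weights[k] for k in metrics) / total_w

def linguistic_label(score):
    """Map a numeric trust score to a coarse linguistic assessment."""
    if score >= 0.8:
        return "high trust"
    if score >= 0.5:
        return "moderate trust"
    return "low trust"

# Hypothetical metric values for a classifier evaluated on one dataset.
metrics = {"performance": 0.91, "fairness": 0.62, "interpretability": 0.70}
weights = {"performance": 0.40, "fairness": 0.35, "interpretability": 0.25}

score = weighted_trust_score(metrics, weights)
print(round(score, 3), linguistic_label(score))  # → 0.756 moderate trust
```

Note how a strong performance metric (0.91) can coexist with a weak fairness metric (0.62), pulling the aggregate down to "moderate trust": this is the kind of limitation the abstract says remains hidden when only predictive performance is reported.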

More information

Title according to WOS: ID WOS:001670132600001
Journal title: APPLIED SCIENCES-BASEL
Volume: 16
Issue: 2
Publisher: MDPI
Publication date: 2026
DOI: 10.3390/app16021077

Notes: ISI