Sparseness-optimized feature importance with prior knowledge and reinforcement learning-powered optimization

Napoles, Gonzalo; Grau, Isel; Salgueiro, Yamisleydi

Abstract

Sparseness Optimized Feature Importance (SOFI) is a post-hoc method that produces explanations using minimal feature sets, reducing the cognitive burden on human experts by highlighting only the most critical factors. In practice, explanations take the form of a ranking of features whose cumulative marginalization leads to rapid degradation in model performance. However, SOFI employs hill climbing for ranking optimization, which increases the risk of convergence to local optima when the number of features grows. In addition, like other mainstream explainers, SOFI lacks a mechanism for exploiting prior knowledge during optimization. In this paper, we propose Sparseness Optimized Feature Importance with Prior Knowledge (SOFI-P), an extension of SOFI that integrates prior knowledge into a reinforcement learning framework to optimize explanation sparsity. In this explainer, the exploration is guided by a probabilistic swapping strategy that maximizes model performance degradation under cumulative feature marginalization. Prior knowledge is incorporated as a learnable parameter vector, initially defined by domain experts and later updated during optimization. In addition, we derive upper bounds on the change in explanation sparsity induced by adjacent and arbitrary swaps in a feature ranking. The proposed theorems provide practical value by establishing concrete limits for expected explanation sparsity post-swapping, thereby characterizing the problem's search space complexity. Empirical evaluation on 40 structured classification datasets shows that SOFI-P produces sparser explanations than state-of-the-art explainers. Furthermore, ablation studies confirm the benefits of incorporating prior knowledge to guide reinforcement learning, even when such knowledge is imprecise. Finally, a case study on chest X-ray images illustrates the practical applicability of the method.
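The core idea behind the sparsity objective (a ranking is better when cumulatively marginalizing its leading features degrades model performance quickly) can be illustrated with a toy sketch. The synthetic data, the mean-imputation form of marginalization, and the stand-in "model" below are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: feature 0 is strongly predictive, features 1-3 are noise.
X = rng.normal(size=(200, 4))
y = (X[:, 0] > 0).astype(int)

def accuracy(X_eval):
    # Stand-in "model": predicts the class from feature 0 only.
    return np.mean((X_eval[:, 0] > 0).astype(int) == y)

def degradation_curve(ranking):
    # Cumulatively marginalize features in the given order (here: mean
    # imputation) and record model accuracy after each step.
    Xm = X.copy()
    curve = []
    for f in ranking:
        Xm[:, f] = X[:, f].mean()
        curve.append(accuracy(Xm))
    return curve

# A ranking that places the critical feature first degrades performance
# fastest; this is what a sparseness-optimized explainer rewards.
good = degradation_curve([0, 1, 2, 3])
bad = degradation_curve([3, 2, 1, 0])
print(good, bad)
```

After the first marginalization step, the `good` ranking already collapses accuracy to chance level, while the `bad` ranking leaves it untouched; both curves coincide once every feature has been marginalized.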

Further information

WOS ID: WOS:001684623900001
Journal: NEUROCOMPUTING
Volume: 674
Publisher: Elsevier
Publication date: 2026
DOI: 10.1016/j.neucom.2026.132925

Notes: ISI