An Attention-Based Architecture for Hierarchical Classification With CNNs
Abstract
Branch Convolutional Neural Nets have become a popular approach for hierarchical classification in computer vision and other areas. Unfortunately, these models often lead to hierarchical inconsistency: predictions for the different hierarchy levels do not necessarily respect the class-subclass constraints imposed by the hierarchy. Several architectures that connect the branches have been proposed to overcome this limitation. In this paper, we propose a more straightforward and flexible method: let the neural net decide how these branches must be connected. We achieve this by formulating an attention mechanism that dynamically determines how branches influence each other during training and inference. Experiments on image classification benchmarks show that the proposed method can outperform state-of-the-art models in terms of hierarchical performance metrics and consistency. Furthermore, although we sometimes found slightly lower performance at the deeper level of the hierarchy, the model predicts the ground-truth path between a concept and its ancestors in the hierarchy much more accurately. This result suggests that the model learns not only local class memberships but also hierarchical dependencies between concepts.
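To make the notion of hierarchical inconsistency concrete, the following minimal sketch checks whether per-level predictions respect the class-subclass constraints, and how often the full ground-truth path is predicted. It is an illustration only, not the paper's evaluation code: the two-level hierarchy `PARENT`, the class names, and the functions `is_consistent`, `consistency_rate`, and `exact_path_accuracy` are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical two-level hierarchy (assumption): fine class -> coarse parent.
PARENT: Dict[str, str] = {
    "cat": "animal",
    "dog": "animal",
    "car": "vehicle",
    "truck": "vehicle",
}

# A prediction is a (coarse_label, fine_label) pair, one label per branch.
Prediction = Tuple[str, str]


def is_consistent(coarse: str, fine: str, parent: Dict[str, str] = PARENT) -> bool:
    """True when the predicted fine class is a child of the predicted coarse class."""
    return parent.get(fine) == coarse


def consistency_rate(preds: List[Prediction], parent: Dict[str, str] = PARENT) -> float:
    """Fraction of predictions that respect the class-subclass constraint."""
    if not preds:
        return 0.0
    return sum(is_consistent(c, f, parent) for c, f in preds) / len(preds)


def exact_path_accuracy(preds: List[Prediction], truths: List[Prediction]) -> float:
    """Fraction of samples whose whole predicted path equals the ground-truth path."""
    if not preds or len(preds) != len(truths):
        raise ValueError("preds and truths must be non-empty and of equal length")
    return sum(p == t for p, t in zip(preds, truths)) / len(preds)


if __name__ == "__main__":
    preds = [("animal", "cat"), ("vehicle", "dog"), ("vehicle", "car"), ("animal", "truck")]
    truths = [("animal", "cat"), ("animal", "dog"), ("vehicle", "car"), ("vehicle", "truck")]
    print("consistency rate:", consistency_rate(preds))
    print("exact path accuracy:", exact_path_accuracy(preds, truths))
```

In this toy data the second and fourth samples get the fine class right but pair it with the wrong coarse class, so fine-level accuracy alone would hide the inconsistency that the path-level measure exposes.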
More information
Title according to WOS: | An Attention-Based Architecture for Hierarchical Classification With CNNs |
Journal title: | IEEE ACCESS |
Volume: | 11 |
Publisher: | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
Publication date: | 2023 |
First page: | 32972 |
Last page: | 32995 |
DOI: | 10.1109/ACCESS.2023.3263472 |
Notes: | ISI |