PDI - Search Result

Article ISI SCOPUS
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2023)

Enhancing Intra-modal Similarity in a Cross-Modal Triplet Loss

Mallea, Mario; Nanculef, Ricardo; Araya, Mauricio; Bifet, A; Lorena, AC; Ribeiro R.P.; Gama, J; Abreu, PH

Abstract

Cross-modal retrieval requires building a common latent space that captures and correlates information from different data modalities, usually images and texts. Cross-modal training based on the triplet loss with hard negative mining is a state-of-the-art technique to address this problem. This paper shows that such an approach is not always effective in handling intra-modal similarities. Specifically, we found that this method can lead to inconsistent similarity orderings in the latent space, where intra-modal pairs with unknown ground-truth similarity are ranked higher than cross-modal pairs representing the same concept. To address this problem, we propose two novel loss functions that leverage intra-modal similarity constraints available in a training triplet but not used by the original formulation. Additionally, this paper explores the application of this framework to unsupervised image retrieval problems, where cross-modal training can provide the supervisory signals that are otherwise missing in the absence of category labels. To our knowledge, we are the first to evaluate cross-modal training for intra-modal retrieval without labels. We present comprehensive experiments on MS-COCO and Flickr30K, demonstrating the advantages and limitations of the proposed methods in cross-modal and intra-modal retrieval tasks in terms of performance and novelty measures. Our code is publicly available on GitHub: https://github.com/MariodotR/FullHN.git.
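
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of a cross-modal triplet loss with in-batch hardest-negative mining, plus a small diagnostic for the kind of inconsistent ordering the abstract describes. It is not the authors' code and does not implement the two loss functions the paper proposes; those are not specified in this record. The function names, the cosine similarity, the margin value of 0.2, and the exact form of the diagnostic are all assumptions made for illustration.

```python
import math

def normalize(v):
    # Scale a vector to unit length (zero vectors are left unchanged).
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def cosine_matrix(a, b):
    # result[i][j] = cosine similarity between a[i] and b[j].
    a = [normalize(v) for v in a]
    b = [normalize(v) for v in b]
    return [[sum(x * y for x, y in zip(u, w)) for w in b] for u in a]

def hard_negative_triplet_loss(img_emb, txt_emb, margin=0.2):
    # Assumption: img_emb[i] and txt_emb[i] describe the same concept,
    # and every other item in the batch is treated as a negative.
    sim = cosine_matrix(img_emb, txt_emb)
    n = len(sim)
    neg_inf = float("-inf")
    total = 0.0
    for i in range(n):
        pos = sim[i][i]
        # Hardest negative text for image i, and hardest negative image for text i.
        hardest_txt = max((sim[i][j] for j in range(n) if j != i), default=neg_inf)
        hardest_img = max((sim[j][i] for j in range(n) if j != i), default=neg_inf)
        total += max(0.0, margin + hardest_txt - pos)
        total += max(0.0, margin + hardest_img - pos)
    return total

def count_intra_modal_violations(img_emb, txt_emb):
    # Hypothetical diagnostic (one possible reading of the abstract):
    # count image pairs (i, k) that are more similar to each other than
    # image i is to its own matching text.
    cross = cosine_matrix(img_emb, txt_emb)
    intra = cosine_matrix(img_emb, img_emb)
    n = len(cross)
    return sum(
        1
        for i in range(n)
        for k in range(n)
        if k != i and intra[i][k] > cross[i][i]
    )
```

Note that the loss above only compares cross-modal similarities (image against text), so nothing in it constrains how image-image or text-text similarities are ordered. That gap is what the abstract identifies as the source of inconsistent orderings.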

More information

Title according to WOS:	ID WOS:001455440700017 Not found in local WOS DB
Title according to SCOPUS:	Enhancing Intra-modal Similarity in a Cross-Modal Triplet Loss
Journal title:	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume:	14276
Publisher:	Springer Science and Business Media Deutschland GmbH
Publication date:	2023
Start page:	249
End page:	264
Language:	English
DOI:	10.1007/978-3-031-45275-8_17
Notes:	ISI, SCOPUS