Multimodal Emotion Recognition Dataset in the Wild (MERDWild)

Martinez, Facundo; Aguilera, Ana; Mellado, Diego

Abstract

Multimodal emotion recognition involves identifying human emotions in specific situations using artificial intelligence across multiple modalities. MERDWild, a multimodal emotion recognition dataset, addresses the challenge of unifying, cleaning, and transforming three datasets collected in uncontrolled environments, with the aim of integrating and standardizing a database that encompasses three modalities: facial images, audio, and text. A methodology is presented that combines information from these modalities, drawing on the "in-the-wild" datasets AFEW, AffWild2, and MELD. MERDWild consists of 15,873 audio samples, 905,281 facial images, and 15,321 sentences, all considered usable-quality data. The project outlines the entire process of data cleaning, transformation, normalization, and quality control, resulting in a unified structure for recognizing seven emotions. © 2023 IEEE.
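
To make the idea of a "unified structure" across the three modalities concrete, the sketch below shows one possible way such a record could be represented in Python. The field names, the seven-emotion label set, and the example paths are illustrative assumptions, not the authors' actual schema.

# Minimal sketch (assumed, not from the paper) of a unified multimodal record:
# one sample links a facial image, an audio clip, and a transcript to a single
# emotion label, plus the source dataset it came from.
from dataclasses import dataclass
from typing import Optional

# Seven basic emotion categories commonly used by in-the-wild datasets such as
# AFEW and MELD (assumed here; the paper's exact label set may differ).
EMOTIONS = {"anger", "disgust", "fear", "happiness", "sadness", "surprise", "neutral"}

@dataclass
class MultimodalSample:
    sample_id: str                  # unique identifier after unification
    source_dataset: str             # "AFEW", "AffWild2", or "MELD"
    face_image_path: Optional[str]  # path to a cleaned/normalized face crop
    audio_path: Optional[str]       # path to the audio segment
    transcript: Optional[str]       # utterance text, when available
    emotion: str                    # one of the seven unified labels

    def __post_init__(self):
        if self.emotion not in EMOTIONS:
            raise ValueError(f"Unknown emotion label: {self.emotion}")

# Example usage with hypothetical paths:
sample = MultimodalSample(
    sample_id="meld_dia0_utt3",
    source_dataset="MELD",
    face_image_path="faces/meld/dia0_utt3_f012.png",
    audio_path="audio/meld/dia0_utt3.wav",
    transcript="I can't believe you did that!",
    emotion="surprise",
)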

More information

Title according to SCOPUS: Multimodal Emotion Recognition Dataset in the Wild (MERDWild)
Journal title: Proceedings - IEEE CHILEAN Conference on Electrical, Electronics Engineering, Information and Communication Technologies, ChileCon
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication date: 2023
Language: Spanish
DOI: 10.1109/CHILECON60335.2023.10418672

Notes: SCOPUS