Hate Speech Recognition in Chilean Tweets

Tobar-Arancibia, Alfonso; Moreno, Sebastian; Lopatin, Javier

Abstract

Hate speech, which targets specific groups based on race, religion, or sexual orientation, is a growing concern, especially on social media. Detecting hate speech is a critical research area, but most models are developed in English, leaving a gap for other languages like Spanish. Spanish presents additional challenges due to its regional variants and slang. In this paper, we introduce HateStack, the winning model of the 2022 Datathon at Universidad Técnica Federico Santa María, Chile, designed to detect hate speech in Chilean tweets. HateStack is a two-level ensemble model comprising a feature extraction process, five level-1 models, and a logistic regression as a second-level model. The results demonstrate that HateStack outperforms other ensemble models and RoBERTuito, a transformer-based deep learning model tailored for hate speech detection on tweets. Developing such models in non-English languages is important to detect hate speech effectively.
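The two-level structure described in the abstract (level-1 models whose outputs feed a logistic-regression second level) can be illustrated with a minimal stacking sketch. This is not the authors' implementation: the abstract does not name the five level-1 models or the extracted features, so the sketch assumes five single-feature logistic regressions as stand-ins, synthetic numeric features in place of real tweet features, and hypothetical helper names (`fit_logreg`, `stack_fit`, `stack_predict`). It uses only the Python standard library.

```python
import math
import random

def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logreg(X, y, lr=0.5, epochs=200):
    """Logistic regression by batch gradient descent; returns (weights, bias)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict_proba(model, x):
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def stack_fit(X, y, feature_groups, n_folds=5):
    """Fit one level-1 model per feature group, then a logistic-regression meta-model."""
    n = len(X)
    meta_X = [[0.0] * len(feature_groups) for _ in range(n)]
    folds = [list(range(k, n, n_folds)) for k in range(n_folds)]
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(n) if i not in held]
        y_train = [y[i] for i in train_idx]
        for m, cols in enumerate(feature_groups):
            X_train = [[X[i][c] for c in cols] for i in train_idx]
            model = fit_logreg(X_train, y_train)
            # Out-of-fold predictions: the meta-model never sees a level-1
            # output for a row that the level-1 model was trained on.
            for i in held_out:
                meta_X[i][m] = predict_proba(model, [X[i][c] for c in cols])
    # Refit every level-1 model on all rows for use at prediction time.
    level1 = [fit_logreg([[row[c] for c in cols] for row in X], y)
              for cols in feature_groups]
    meta = fit_logreg(meta_X, y)
    return level1, meta

def stack_predict(level1, meta, feature_groups, x):
    z = [predict_proba(m, [x[c] for c in cols])
         for m, cols in zip(level1, feature_groups)]
    return predict_proba(meta, z)

# Synthetic stand-in for extracted features (assumption: not real tweet data).
random.seed(0)
y = [i % 2 for i in range(200)]
X = [[random.gauss(1.0 if label else -1.0, 1.0) for _ in range(5)] for label in y]
groups = [[0], [1], [2], [3], [4]]  # five level-1 models, one feature each

level1, meta = stack_fit(X[:150], y[:150], groups)
preds = [stack_predict(level1, meta, groups, x) >= 0.5 for x in X[150:]]
accuracy = sum(int(p) == t for p, t in zip(preds, y[150:])) / 50
print("held-out accuracy:", accuracy)
```

Training the second level on out-of-fold predictions is the usual design choice for stacking, since it keeps the meta-model from simply rewarding whichever level-1 model overfits the training rows most. A real system would replace the stand-in learners with heterogeneous classifiers over text-derived features.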

More information

Title according to SCOPUS: ID SCOPUS_ID:85178999475 Not found in local SCOPUS DB
Journal title: 2018 37TH INTERNATIONAL CONFERENCE OF THE CHILEAN COMPUTER SCIENCE SOCIETY (SCCC)
Publication date: 2023
DOI: 10.1109/SCCC59417.2023.10315748

Notes: SCOPUS