Detecting Hate Speech on Brazilian Social Media: New Dataset and Analysis

Felipe Ramos de Oliveira; Victoria Dias Reis; Nelson Francisco Favilla Ebecken

doi:10.55592/cilamce.v6i06.8208

Authors

Felipe Ramos de Oliveira UFRJ - Universidade Federal do Rio de Janeiro
Victoria Dias Reis UFRJ - Universidade Federal do Rio de Janeiro
Nelson Francisco Favilla Ebecken UFRJ - Universidade Federal do Rio de Janeiro

Keywords:

Dataset, classification, machine learning

Abstract

Social media plays a crucial role in human interaction, facilitating communication and self-expression. However, the proliferation of hate speech on these platforms poses significant risks to individuals and communities. Detecting and addressing hate speech is particularly challenging in languages such as Portuguese because of its rich vocabulary, complex grammar, and regional variations. To address this challenge, we introduce TuPy-E, the largest annotated Portuguese corpus dedicated to hate speech detection. Through an analysis using BERT and GPT-2 models, our research contributes to both academic understanding and practical applications of hate speech detection.
