Accent-Aware Deepfake Speech Detection in Brazilian Portuguese: Dataset Construction and Model Evaluation

Authors

  • Sofia Nascimento da Silva
  • Erick Miguel Barbosa dos Santos
  • Katarina Veljovic
  • Karin Komati

Keywords:

Regional Accents, Text-to-Speech, Synthetic Audio

Abstract

The rise of digitally manipulated audio content creates new challenges in verifying the authenticity of information, especially on social media. Advances in artificial intelligence (AI), particularly in text-to-speech and voice synthesis technologies, have greatly improved the quality and realism of generated audio. This study addresses the problem of deepfake audio detection and makes two main contributions. First, it introduces the FakeBrAccent dataset, which includes 746 audio samples (373 real and 373 synthetic) in Brazilian Portuguese, featuring regional accents such as Baiano (Bahia), Fluminense (Rio de Janeiro and Espírito Santo), Sulista (Southern Brazil), Nordestino (Northeastern Brazil), and Carioca (Rio de Janeiro city). The original BrAccent dataset served both as the source of real samples and as a reference for simulating accents when generating synthetic samples with a text-to-speech tool. Second, the study evaluates two classification models, Convolutional Neural Networks and XGBoost, on this dataset. The models were assessed using standard performance metrics: accuracy, precision, recall, and F1-score. The findings provide a baseline for future research into synthetic speech detection in Brazilian Portuguese, emphasizing the role of accent variation in model performance.
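For readers unfamiliar with the four metrics named in the abstract, the following is a minimal sketch of how they are conventionally computed for a binary real-versus-synthetic classifier. It is not the authors' evaluation code: the function name `binary_metrics`, the choice of label 1 for "synthetic", and the toy labels are all assumptions made for illustration, and the numbers are not results from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    `positive` marks the class treated as the detection target
    (assumed here to be 1 = synthetic audio; this is illustrative only).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (made up): 1 = synthetic, 0 = real.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On a balanced dataset such as the one described (equal numbers of real and synthetic samples), accuracy is a reasonable summary, while precision and recall show whether errors are mostly false alarms or missed fakes.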

Published

2025-12-01

Issue

Section

Articles