dc.contributor.author |
Srikissoon, Trishanta
|
|
dc.contributor.author |
Marivate, Vukosi
|
|
dc.date.accessioned |
2024-05-30T11:03:48Z |
|
dc.date.available |
2024-05-30T11:03:48Z |
|
dc.date.issued |
2023 |
|
dc.description.abstract |
Automated hate speech detection is important for protecting people’s dignity, online experiences, and physical safety in Society 5.0. Transformers are sophisticated pre-trained language models that can be fine-tuned for multilingual hate speech detection. Many studies consider this application as a binary classification problem. Additionally, research on topical hate speech detection uses target-specific datasets containing assertions about a particular group. In this paper, we investigate multi-class hate speech detection using target-generic datasets. We assess the performance of mBERT and XLM-RoBERTa on high- and low-resource languages, with limited sample sizes and class imbalance. We find that our fine-tuned mBERT models are performant in detecting gender-targeted hate speech. Our Urdu classifier produces a 31% lift over the baseline model. We also present a pipeline for processing multilingual datasets for multi-class hate speech detection. Our approach could be used in future work on topically focused hate speech detection for other low-resource languages, particularly African languages, which remain under-explored in this domain. |
en_US |
dc.description.department |
Computer Science |
en_US |
dc.description.librarian |
am2024 |
en_US |
dc.description.sdg |
SDG-09: Industry, innovation and infrastructure |
en_US |
dc.description.sponsorship |
The ABSA Chair of Data Science, the TensorFlow Award for Machine Learning Grant. |
en_US |
dc.description.uri |
https://easychair.org/publications/EPiC/Computing |
en_US |
dc.identifier.citation |
Srikissoon, T. & Marivate, V. 2023, 'Combating hate : how multilingual transformers can help detect topical hate speech', EPiC Series in Computing, vol. 93, pp. 203-215. DOI:10.29007/1cm6. |
en_US |
dc.identifier.issn |
2398-7340 (online) |
|
dc.identifier.other |
10.29007/1cm6 |
|
dc.identifier.uri |
http://hdl.handle.net/2263/96304 |
|
dc.language.iso |
en |
en_US |
dc.publisher |
EasyChair |
en_US |
dc.rights |
© 2023 EasyChair. |
en_US |
dc.subject |
Hate speech |
en_US |
dc.subject |
Machine learning |
en_US |
dc.subject |
Natural language processing |
en_US |
dc.subject |
SDG-08: Decent work and economic growth |
en_US |
dc.title |
Combating hate : how multilingual transformers can help detect topical hate speech |
en_US |
dc.type |
Article |
en_US |