Exploring COVID-19 public perceptions in South Africa through sentiment analysis and topic modelling of Twitter posts

Kekere, Temitope; Marivate, Vukosi; Hattingh, Maria J. (Marie)

URI: http://hdl.handle.net/2263/97160

Date: 2023

Abstract:

The narratives shared on social media during a health crisis such as COVID-19 reflect public perceptions of the crisis. This article provides findings from a study of the perceptions of South African citizens regarding the government’s response to the COVID-19 pandemic from March to May 2020. The study analysed Twitter data from posts by government officials and the public in South Africa to measure the public’s confidence in how the government was handling the pandemic. A third of the tweet dataset was labelled using the Valence Aware Dictionary and sEntiment Reasoner (VADER) lexicon, forming the training set for four classical machine-learning algorithms employed for sentiment analysis: logistic regression (LR), support vector machines (SVM), random forest (RF), and extreme gradient boosting (XGBoost). The effectiveness of these classifiers varied, with error rates of 17% for XGBoost, 14% for RF, and 7% for both SVM and LR. The best-performing algorithm (SVM) was subsequently used to label the remaining two-thirds of the tweet dataset. In addition, the study used, and evaluated the effectiveness of, two topic-modelling algorithms, latent Dirichlet allocation (LDA) and non-negative matrix factorisation (NMF), for classification of the most frequently occurring narratives in the Twitter data. The better-performing of these two algorithms, NMF, identified a prevalence of positive narratives in South African public sentiment towards the government’s response to COVID-19.
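
The labelling workflow described in the abstract (lexicon-label one third of the tweets, train a classifier on that third, then let the classifier label the remaining two-thirds) can be sketched roughly as follows. This is a minimal illustration and not the authors' code. Several things here are assumptions: the tweets are invented, `TOY_LEXICON` and `lexicon_label` are hypothetical stand-ins for the real VADER lexicon, and the TF-IDF features and scikit-learn's `LinearSVC` are one plausible choice of features and SVM variant that the abstract does not specify.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy lexicon standing in for VADER (NOT the real VADER word list).
TOY_LEXICON = {"great": 1.0, "good": 1.0, "safe": 0.5,
               "bad": -1.0, "fail": -1.0, "unsafe": -0.5}


def lexicon_label(text):
    """Label a text by summing toy lexicon scores over whitespace tokens."""
    score = sum(TOY_LEXICON.get(tok, 0.0) for tok in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented example tweets; the study's dataset is not reproduced here.
tweets = [
    "great response from the government",
    "lockdown rules are bad for business",
    "the briefing is at eight tonight",
    "testing rollout was a fail",
    "good to see clear lockdown rules",
    "government response was bad",
    "feeling safe with the new rules",
    "briefing tonight at eight",
    "the rollout is great",
    "business support was a fail",
    "unsafe conditions at testing sites",
    "great briefing from the government",
]

# Step 1: label the first third with the lexicon to form the training set.
cut = len(tweets) // 3
seed, rest = tweets[:cut], tweets[cut:]
seed_labels = [lexicon_label(t) for t in seed]

# Step 2: train a classifier on the lexicon-labelled third.
# TF-IDF + linear SVM is an assumed configuration, not taken from the paper.
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(seed, seed_labels)

# Step 3: use the trained classifier to label the remaining two-thirds.
predicted = [str(p) for p in model.predict(rest)]

for text, label in zip(rest, predicted):
    print(f"{label:>8}  {text}")
```

With a seed set this small the predictions are not meaningful; the point is only the shape of the workflow. In a setting like the study's, part of the lexicon-labelled third would be held out so that an error rate (one minus accuracy) could be compared across classifiers before choosing one to label the rest.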
