The coronavirus disease (COVID-19), caused by the SARS-CoV-2 virus, was declared a pandemic
by the World Health Organization (WHO) in March 2020. Currently, there are no vaccines or
treatments that have been approved after clinical trials. Social distancing measures, including
travel bans, school closures, and quarantines applied to countries or regions, are being used to
limit the spread of the disease and the demand on healthcare infrastructure. The seclusion of
groups and individuals has led to limited access to accurate information. To keep the public
informed in South Africa, the Minister of Health makes daily announcements. These
announcements describe the confirmed COVID-19 cases and include the age, gender, and travel
history of people who have tested positive for the disease. Additionally, the South African
National Institute for Communicable Diseases publishes a daily infographic summarising the
number of tests performed, confirmed cases, mortality rate, and the regions affected. However,
the ages of patients and other detailed data on transmission are shared only in the
daily announcements and not in the infographic. To disseminate this information, the
Data Science for Social Impact research group at the University of Pretoria, South Africa, has
worked on curating and applying publicly available data in a computer-readable form so
that the information can be shared with the public through both a data repository and a dashboard.
Through collaborative practices, a variety of challenges related to publicly available data in
South Africa came to the fore. These include shortcomings in the accessibility and integrity of
the data, and in data management practices between government departments and the South
African public. This paper shares solutions to these problems, using a publicly available data
repository and dashboard as a case study.