Using diverse data sources to impute missing air quality data collected in a resource-limited setting

dc.contributor.author: Kebalepile, Moses Mogakolodi
dc.contributor.author: Dzikiti, Loveness Nyaradzo
dc.contributor.author: Voyi, Kuku
dc.date.accessioned: 2024-08-14T07:47:01Z
dc.date.available: 2024-08-14T07:47:01Z
dc.date.issued: 2024-03
dc.description [en_US]: DATA AVAILABILITY STATEMENT: Environmental data were available through the municipal offices of the cities and can be requested. Disclosure of the use of the data is a requirement of the cities. Data can be requested on behalf of cities from the Department of Environmental Affairs (DoEA). The data custodian for the DoEA is the South African Air Quality Services (SAAQS), and data can be requested through the SAAQS website: Saaqis (environment.gov.za), accessed on 23 September 2022.
dc.description.abstract [en_US]: The sustainable operation of ambient air quality monitoring stations in developing countries is not always possible. Intermittent failures and breakdowns at air quality monitoring stations often interrupt the continuous measurement of data, resulting in missing data. This study aimed to impute NO2, SO2, O3, and PM10 to produce complete data sets of daily average exposures from 2010 to 2017. Models were built for (a) an individual pollutant at a monitoring station, (b) a combined model for the same pollutant from different stations, and (c) a data set with all the pollutants from all the monitoring stations. This study sought to evaluate the efficacy of the Multiple Imputation by Chained Equations (MICE) algorithm in imputing air quality data that are missing at random. The application of classification and regression trees (CART) analysis using the MICE package in the R statistical programming language was compared with the predictive mean matching (PMM) method. The CART method performed better, with the pooled R-squared statistics of the imputed data ranging from 0.3 to 0.7, compared to a range of 0.02 to 0.25 for PMM. The MICE algorithm successfully resolved the incompleteness of the data. It was concluded that the CART method produced more reliable data than the PMM method. However, in this study, the pooled R-squared values were accurate for NO2 but less so for the other pollutants.
dc.description.department [en_US]: School of Health Systems and Public Health (SHSPH)
dc.description.sdg [en_US]: SDG-03: Good health and well-being
dc.description.sdg [en_US]: SDG-11: Sustainable cities and communities
dc.description.sdg [en_US]: SDG-13: Climate action
dc.description.sponsorship [en_US]: The South African Medical Research Council.
dc.description.uri [en_US]: https://www.mdpi.com/journal/atmosphere
dc.identifier.citation [en_US]: Kebalepile, M.M.; Dzikiti, L.N.; Voyi, K. Using Diverse Data Sources to Impute Missing Air Quality Data Collected in a Resource-Limited Setting. Atmosphere 2024, 15, 303. https://doi.org/10.3390/atmos15030303.
dc.identifier.issn: 2073-4433 (online)
dc.identifier.other: 10.3390/atmos15030303
dc.identifier.uri: http://hdl.handle.net/2263/97617
dc.language.iso [en_US]: en
dc.publisher [en_US]: MDPI
dc.rights [en_US]: © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
dc.subject [en_US]: MICE imputation
dc.subject [en_US]: Air quality
dc.subject [en_US]: Missing data
dc.subject [en_US]: Classification
dc.subject [en_US]: Regression trees
dc.subject [en_US]: Classification and regression trees (CART)
dc.subject [en_US]: Predictive mean matching (PMM)
dc.subject [en_US]: Multivariate imputation by chained equations (MICE)
dc.subject [en_US]: SDG-03: Good health and well-being
dc.subject [en_US]: SDG-11: Sustainable cities and communities
dc.subject [en_US]: SDG-13: Climate action
dc.title [en_US]: Using diverse data sources to impute missing air quality data collected in a resource-limited setting
dc.type [en_US]: Article

Files

Original bundle

Name: Kebalepile_Using_2024.pdf
Size: 2.24 MB
Format: Adobe Portable Document Format
Description: Article

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission
Description: