The explainability of gradient-boosted decision trees for digital elevation model (DEM) error prediction
Authors
Okolie, Chukwuma
Mills, Jon
Adeleke, Adedayo
Smit, Julian
Maduako, Ikechukwu
Publisher
International Society of Photogrammetry and Remote Sensing
Abstract
Gradient boosted decision trees (GBDTs) have repeatedly outperformed several machine learning and deep learning algorithms in competitive data science. However, the explainability of GBDT predictions, especially with earth observation data, is still an open issue requiring more focus by researchers. In this study, we investigate the explainability of Bayesian-optimised GBDT algorithms for modelling and prediction of the vertical error in the Copernicus GLO-30 digital elevation model (DEM). Three GBDT algorithms are investigated: extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and categorical boosting (CatBoost). SHapley Additive exPlanations (SHAP) are adopted for the explainability analysis. The assessment sites are selected from urban/industrial and mountainous landscapes in Cape Town, South Africa. The training datasets comprise eleven predictor variables that are known influencers of elevation error: elevation, slope, aspect, surface roughness, topographic position index, terrain ruggedness index, terrain surface texture, vector ruggedness measure, forest cover, bare ground cover, and urban footprints. The target variable (elevation error) was calculated with respect to accurate airborne LiDAR data. After model training and testing, the GBDTs were applied to predict the elevation error at model implementation sites. The SHAP plots showed varying levels of emphasis on the parameters depending on the land cover and terrain. For example, in the urban area, the influence of vector ruggedness measure surpassed that of first-order derivatives such as slope and aspect. Thus, it is recommended that machine learning modelling procedures and workflows incorporate model explainability to ensure robust interpretation and understanding of model predictions by both technical and non-technical users.
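The idea behind the SHAP analysis can be illustrated with a minimal sketch that computes exact Shapley attributions for a single prediction. This is not the authors' code. The function `toy_error_model`, its thresholds, and the sample and baseline values are hypothetical stand-ins for a trained GBDT and real predictor data. The brute-force enumeration is practical only for a handful of features, and features left out of a coalition are set to a baseline value, which is one simple convention among several.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one sample by enumerating coalitions.

    Features outside a coalition take their baseline value
    (an assumption of this sketch, not a detail from the paper).
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_error_model(features):
    """Hypothetical rule-based stand-in for a trained GBDT.

    Inputs: slope, vector ruggedness measure, urban footprint flag.
    Output: an illustrative elevation error value.
    """
    slope, vrm, urban = features
    error = 0.5
    if slope > 10.0:
        error += 1.0
    if vrm > 0.02:
        # Ruggedness matters more inside the urban footprint (made-up rule).
        error += 2.0 if urban > 0.5 else 0.5
    return error


sample = [15.0, 0.05, 1.0]    # hypothetical urban pixel
baseline = [5.0, 0.01, 0.0]   # hypothetical reference pixel
phi = shapley_values(toy_error_model, sample, baseline)

for name, value in zip(["slope", "vector ruggedness measure", "urban footprint"], phi):
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the additive property that SHAP plots rely on. In this toy setup the vector ruggedness measure receives a larger attribution than slope only because the hypothetical rules were written that way; it does not reproduce the study's results.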
Keywords
Shapley additive explanations, Extreme gradient boosting, Categorical boosting, Light boosting machine, Machine learning explainability, Gradient boosted decision trees (GBDTs), SDG-09: Industry, innovation and infrastructure
Sustainable Development Goals
SDG-09: Industry, innovation and infrastructure
Citation
Okolie, C., Mills, J., Adeleke, A., Smit, J., and Maduako, I.: The explainability of gradient-boosted decision trees for digital elevation model (DEM) error prediction, International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, XLVIII-M-3-2023, 161–168, https://doi.org/10.5194/isprs-archives-XLVIII-M-3-2023-161-2023, 2023.