Interpretable machine learning in natural language processing for misinformation data


dc.contributor.advisor Marivate, Vukosi
dc.contributor.postgraduate Nkalashe, Yolanda
dc.date.accessioned 2023-10-09T08:02:21Z
dc.date.available 2023-10-09T08:02:21Z
dc.date.created 2023-04
dc.date.issued 2022-11
dc.description Mini Dissertation (MIT (Big Data Science))--University of Pretoria, 2022. en_US
dc.description.abstract The interpretability of models has been one of the focal research topics in the machine learning community due to the rise in the use of black box models and complex state-of-the-art models [6]. Most of these models are debugged through trial and error, based on end-to-end learning [7, 48]. This creates uneasiness and distrust among the end-user consumers of the models, which has limited the use of black box models in disciplines where explainability is required [33]. However, the alternative "white-box" models come with a trade-off in accuracy and predictive power [7]. This research focuses on interpretability in natural language processing for misinformation data. First, we explore example-based techniques through prototype selection to determine whether we can observe any key behavioural insights from a misinformation dataset. We use four prototype selection techniques: Clustering, Set Cover, MMD-critic, and Influential examples. We analyse the quality of each technique's prototype set and select the two sets of optimal quality for further processing: word analysis, analysis of linguistic characteristics, and interpretability analysis with the LIME technique. Secondly, we examine whether any critical insights emerge in the South African disinformation context. en_US
dc.description.availability Unrestricted en_US
dc.description.degree MIT (Big Data Science) en_US
dc.description.department Computer Science en_US
dc.identifier.citation * en_US
dc.identifier.other A2023 en_US
dc.identifier.uri http://hdl.handle.net/2263/92768
dc.language.iso en en_US
dc.publisher University of Pretoria
dc.rights © 2021 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria.
dc.subject UCTD en_US
dc.subject Disinformation en_US
dc.subject Interpretability en_US
dc.subject Prototypes en_US
dc.subject Example-based en_US
dc.subject Interpretable Machine Learning en_US
dc.subject Natural Language Processing en_US
dc.title Interpretable machine learning in natural language processing for misinformation data en_US
dc.type Mini Dissertation en_US
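
For readers who want a concrete picture of two of the techniques named in the abstract, the Python sketch below illustrates clustering-based prototype selection and a LIME word-level explanation on a toy corpus. It is a minimal sketch under stated assumptions, not the dissertation's pipeline: the corpus, labels, class names, and the TF-IDF / logistic-regression classifier are hypothetical stand-ins chosen only to make the example self-contained.

# Minimal illustrative sketch (not the dissertation's actual method):
# (1) clustering-based prototype selection, (2) a LIME text explanation.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import pairwise_distances_argmin_min
from lime.lime_text import LimeTextExplainer

# Hypothetical toy corpus; labels: 1 = misinformation, 0 = credible.
texts = [
    "Miracle cure eliminates virus overnight, doctors stunned",
    "Health department releases updated vaccination schedule",
    "Secret plot hidden in 5G towers, share before it gets deleted",
    "Local clinic extends opening hours for flu season",
]
labels = [1, 0, 1, 0]

# Represent documents with TF-IDF features.
vectoriser = TfidfVectorizer()
X = vectoriser.fit_transform(texts)

# (1) Prototype selection via clustering: fit k-means, then take the document
# closest to each centroid as that cluster's prototype.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
prototype_idx, _ = pairwise_distances_argmin_min(kmeans.cluster_centers_, X)
print("Prototypes:", [texts[i] for i in prototype_idx])

# (2) LIME explanation of a simple classifier's prediction for one document.
clf = LogisticRegression().fit(X, labels)

def predict_proba(raw_texts):
    # LIME expects a function over raw strings that returns class probabilities.
    return clf.predict_proba(vectoriser.transform(raw_texts))

explainer = LimeTextExplainer(class_names=["credible", "misinformation"])
explanation = explainer.explain_instance(texts[0], predict_proba, num_features=5)
print(explanation.as_list())  # word-level contributions to the prediction

In the same spirit, the dissertation's other prototype techniques (Set Cover, MMD-critic, Influential examples) replace the k-means step with a different selection criterion while the downstream word and linguistic analysis of the selected prototypes stays the same.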

