Latent semantic models : a study of probabilistic models for text in information retrieval

dc.contributor.advisor De Waal, Alta
dc.contributor.postgraduate Mjali, Siyabonga Zimozoxolo
dc.date.accessioned 2020-03-31T07:21:02Z
dc.date.available 2020-03-31T07:21:02Z
dc.date.created 2020-09
dc.date.issued 2020
dc.description Mini Dissertation (MSc)--University of Pretoria, 2020. en_ZA
dc.description.abstract Large volumes of text are generated every minute, which necessitates effective and robust tools to retrieve relevant information. Supervised learning approaches have been explored extensively for this task, but it is difficult to secure large collections of labelled data to train such models. Since a supervised approach is too expensive in terms of annotating data, we consider unsupervised methods such as topic models and word embeddings to represent corpora in lower-dimensional semantic spaces. Furthermore, we investigate different distance measures to capture similarity between indexed documents based on their semantic distributions. These include cosine, soft cosine and Jensen-Shannon similarities. The collection of methods discussed in this work allows for the unsupervised association of semantically similar texts, which has a wide range of applications such as fake news detection, sociolinguistics and sentiment analysis. en_ZA
dc.description.availability Unrestricted en_ZA
dc.description.degree MSc (Mathematical Statistics) en_ZA
dc.description.department Statistics en_ZA
dc.description.sponsorship The Hub Internship en_ZA
dc.description.sponsorship Centre for Artificial Intelligence Research en_ZA
dc.identifier.citation Mjali, SZ 2020, Latent semantic models: A study of probabilistic models for text in information retrieval, Masters mini dissertation, University of Pretoria, Pretoria en_ZA
dc.identifier.other S2020 en_ZA
dc.identifier.uri http://hdl.handle.net/2263/73881
dc.language.iso en en_ZA
dc.publisher University of Pretoria
dc.rights © 2019 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria.
dc.subject UCTD
dc.title Latent semantic models : a study of probabilistic models for text in information retrieval en_ZA
dc.type Mini Dissertation en_ZA

