News classification and categorization with smart function sentiment analysis

dc.contributor.author Nkongolo, Mike Nkongolo Wa
dc.date.accessioned 2024-01-11T10:30:27Z
dc.date.available 2024-01-11T10:30:27Z
dc.date.issued 2023-11
dc.description DATA AVAILABILITY: The data supporting the current study are available from the corresponding author upon request. en_US
dc.description.abstract Search engines are tools used to find information on the Internet. Because the web contains a plethora of websites, a search engine queries the majority of active sites and builds a database organized according to the keywords used in searches. When a user types a few descriptive words on the search engine's home page, the search function lists the websites that correspond to these keywords. This approach has shortcomings. For instance, if a user wants information about the word Jaguar, most results relate to either the animal or the car. This is a polysemy problem that leads search engines to return the most popular results rather than the most relevant ones. This article studies the use of sentiment analysis technology to support news classification and categorization and to improve classification accuracy. We introduce a smart search function, embedded in a search engine, that tackles polysemy and records the relevant results so that their sentiment can be determined. The study therefore involves several aspects of natural language processing (NLP) and sentiment analysis for news categorization and classification. A web crawler collected British Broadcasting Corporation (BBC) news across the Internet, the text was preprocessed using NLP, and sentiment analysis methods were applied to determine the polarity of the processed text. The sentiment is the negative, positive, or neutral polarity assigned by the sentiment analysis algorithms. The material gathered from the BBC news site by the web crawler was stored in a database in order to explore the sentiment of BBC news. The natural language toolkit (NLTK) and BM25 were used to index and preprocess patterns in the database. The experimental results show the proposed search function surpassing normal search with an accuracy rate of 85%. Moreover, the SentiStrength algorithm assigned a negative polarity to the BBC news. Furthermore, the Valence Aware Dictionary and sEntiment Reasoner (VADER) was the best-performing sentiment analysis model for news classification, obtaining an accuracy of 85% on data collected with the proposed smart function. en_US
dc.description.department Informatics en_US
dc.description.librarian hj2023 en_US
dc.description.sdg None en_US
dc.description.sponsorship The University of Pretoria’s Faculty of Engineering, Built Environment, and Information Technology through the Doctorate University Capacity Development Program (UCDP). Open access funding was enabled and organized by SANLiC Gold. en_US
dc.description.uri https://www.hindawi.com/journals/ijis en_US
dc.identifier.citation Nkongolo, M.N.W. 2023, 'News classification and categorization with smart function sentiment analysis', International Journal of Intelligent Systems, vol. 2023, art. 1784394, pp. 1-24, doi: 10.1155/2023/1784394. en_US
dc.identifier.issn 0884-8173 (print)
dc.identifier.issn 1098-111X (online)
dc.identifier.other 10.1155/2023/1784394
dc.identifier.uri http://hdl.handle.net/2263/93927
dc.language.iso en en_US
dc.publisher Hindawi en_US
dc.rights © 2023 Mike Nkongolo Wa Nkongolo. This is an open access article distributed under the Creative Commons Attribution License. en_US
dc.subject Valence aware dictionary and sentiment reasoner (VADER) en_US
dc.subject Natural language processing en_US
dc.subject Sentiment analysis en_US
dc.subject News categorization and classification en_US
dc.title News classification and categorization with smart function sentiment analysis en_US
dc.type Article en_US

