Ransomware detection using stacked autoencoder for feature selection

Nkongolo, Mike Nkongolo Wa; Tokmak, Mahmut

dc.contributor.author	Nkongolo, Mike Nkongolo Wa
dc.contributor.author	Tokmak, Mahmut
dc.contributor.email	mike.wankongolo@up.ac.za	en_US
dc.date.accessioned	2024-10-25T07:42:28Z
dc.date.available	2024-10-25T07:42:28Z
dc.date.issued	2024-03
dc.description	DATASET AND CODE AVAILABILITY : https://www.kaggle.com/dsv/7172543	en_US
dc.description.abstract	In response to the escalating malware threats, we propose an advanced ransomware detection and classification method. Our approach combines a stacked autoencoder for precise feature selection with a Long Short-Term Memory classifier, which significantly enhances ransomware stratification accuracy. The process involves thorough preprocessing of the UGRansome dataset, training an unsupervised stacked autoencoder for optimal feature selection, and fine-tuning via supervised learning to elevate the Long Short-Term Memory model's classification capabilities. We meticulously analysed the autoencoder's learned weights and activations to pinpoint essential features for distinguishing 17 ransomware families from other malware and created a streamlined feature set for precise classification. Our results demonstrate the exceptional performance of the stacked autoencoder-based Long Short-Term Memory model across the 17 ransomware families. This model exhibits high precision, recall, and F1 score values. Furthermore, balanced average scores affirm its ability to generalize effectively across various malware types. To optimise the proposed model, we conducted extensive experiments, including up to 400 epochs and varying learning rates, and achieved an exceptional 98.5% accuracy in ransomware classification. These results surpass traditional machine learning classifiers. Moreover, the proposed model surpasses the Extreme Gradient Boosting (XGBoost) algorithm, primarily due to its effective stacked autoencoder feature selection mechanism, and demonstrates outstanding performance in identifying signature attacks with a 98.5% accuracy rate. This result outperforms the XGBoost model, which achieved a 95.5% accuracy rate in the same task.
In addition, a prediction of the ransomware financial impact using the proposed model reveals that while Locky, SamSam, and WannaCry still incur substantial cumulative costs, their attacks may not be as financially damaging as those of NoobCrypt, DMALocker, and EDA2.	en_US
dc.description.department	Informatics	en_US
dc.description.librarian	hj2024	en_US
dc.description.sdg	SDG-09: Industry, innovation and infrastructure	en_US
dc.description.sponsorship	The University of Pretoria's Faculty of Engineering, Built Environment, and Information Technology.	en_US
dc.description.uri	http://section.iaesonline.com/index.php/IJEEI/index	en_US
dc.identifier.citation	Nkongolo, M.N.W. & Tokmak, M. 2024, 'Ransomware detection using stacked autoencoder for feature selection', Indonesian Journal of Electrical Engineering and Informatics, vol. 12, no. 1, pp. 142-170, doi: 10.52549/ijeei.v12i1.5109.	en_US
dc.identifier.issn	2089-3272
dc.identifier.other	10.52549/ijeei.v12i1.5109
dc.identifier.uri	http://hdl.handle.net/2263/98772
dc.language.iso	en	en_US
dc.publisher	Institute of Advanced Engineering and Science	en_US
dc.rights	© 2024 Institute of Advanced Engineering and Science. All rights reserved. This work is licensed under a Creative Commons Attribution 4.0 International License.	en_US
dc.subject	Ransomware classification	en_US
dc.subject	Ransomware profiling	en_US
dc.subject	UGRansome dataset	en_US
dc.subject	Cryptology	en_US
dc.subject	Cyberintelligence	en_US
dc.subject	Autoencoder weights	en_US
dc.subject	Machine learning	en_US
dc.subject	Ensemble learning	en_US
dc.subject	Deep learning	en_US
dc.subject	Supervised learning	en_US
dc.subject	Feature selection	en_US
dc.subject	Malware threats	en_US
dc.subject	Signature attacks	en_US
dc.subject	Intrusion detection	en_US
dc.subject	XGBoost	en_US
dc.subject	Long short-term memory (LSTM)	en_US
dc.subject	Stacked autoencoder	en_US
dc.subject	SDG-09: Industry, innovation and infrastructure	en_US
dc.title	Ransomware detection using stacked autoencoder for feature selection	en_US
dc.type	Article	en_US
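
The abstract mentions analysing the autoencoder's learned weights to pinpoint essential features. The record does not state the exact selection criterion, so the following is only a minimal sketch under an assumed one: each input feature is scored by the summed absolute weight it contributes to the first encoder layer, and the top k features are kept. The function name `rank_features_by_weight`, the weight values, and the feature names are all hypothetical and are not taken from the article or the UGRansome dataset.

```python
from typing import List, Sequence


def rank_features_by_weight(
    weights: Sequence[Sequence[float]],
    names: Sequence[str],
    k: int,
) -> List[str]:
    """Return the k feature names with the largest summed |weight|.

    weights[i][j] is the (assumed) first-layer encoder weight connecting
    input feature i to hidden unit j.
    """
    if len(weights) != len(names):
        raise ValueError("one weight row is required per feature name")
    # Score each feature by the total magnitude of its outgoing weights.
    scores = [sum(abs(w) for w in row) for row in weights]
    # Sort feature indices from highest to lowest score.
    order = sorted(range(len(names)), key=lambda i: scores[i], reverse=True)
    return [names[i] for i in order[:k]]


# Hypothetical weights for three made-up features and two hidden units.
example_weights = [[0.1, -0.2], [0.9, 0.8], [0.0, 0.05]]
example_names = ["feature_a", "feature_b", "feature_c"]
selected = rank_features_by_weight(example_weights, example_names, k=2)
print(selected)
```

In a real pipeline the weight matrix would come from a trained encoder layer, and the reduced feature set would then be passed to the downstream classifier.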

Files

Original bundle

Name: Nkongolo_Ransomware_2024.pdf
Size: 1.83 MB
Format: Adobe Portable Document Format
Description: Article

Collections

Research Articles (Informatics)
Research Articles (University of Pretoria)
