Ransomware detection using stacked autoencoder for feature selection

dc.contributor.author: Nkongolo, Mike Nkongolo Wa
dc.contributor.author: Tokmak, Mahmut
dc.contributor.email: mike.wankongolo@up.ac.za [en_US]
dc.date.accessioned: 2024-10-25T07:42:28Z
dc.date.available: 2024-10-25T07:42:28Z
dc.date.issued: 2024-03
dc.description: DATASET AND CODE AVAILABILITY: https://www.kaggle.com/dsv/7172543 [en_US]
dc.description.abstract: In response to the escalating malware threats, we propose an advanced ransomware detection and classification method. Our approach combines a stacked autoencoder for precise feature selection with a Long Short-Term Memory classifier which significantly enhances ransomware stratification accuracy. The process involves thorough preprocessing of the UGRansome dataset, training an unsupervised stacked autoencoder for optimal feature selection, and fine-tuning via supervised learning to elevate the Long Short-Term Memory model's classification capabilities. We meticulously analysed the autoencoder's learned weights and activations to pinpoint essential features for distinguishing 17 ransomware families from other malware and created a streamlined feature set for precise classification. Our results demonstrate the exceptional performance of the stacked autoencoder-based Long Short-Term Memory model across the 17 ransomware families. This model exhibits high precision, recall, and F1 score values. Furthermore, balanced average scores affirm its ability to generalize effectively across various malware types. To optimise the proposed model, we conducted extensive experiments, including up to 400 epochs, and varying learning rates and achieved an exceptional 98.5% accuracy in ransomware classification. These results surpass traditional machine learning classifiers. Moreover, the proposed model surpasses the Extreme Gradient Boosting (XGBoost) algorithm, primarily due to its effective stacked autoencoder feature selection mechanism and demonstrates outstanding performance in identifying signature attacks with a 98.5% accuracy rate. This result outperforms the XGBoost model, which achieved a 95.5% accuracy rate in the same task. In addition, a prediction of the ransomware financial impact using the proposed model reveals that while Locky, SamSam, and WannaCry still incur substantial cumulative costs, their attacks may not be as financially damaging as those of NoobCrypt, DMALocker, and EDA2. [en_US]
dc.description.department: Informatics [en_US]
dc.description.librarian: hj2024 [en_US]
dc.description.sdg: SDG-09: Industry, innovation and infrastructure [en_US]
dc.description.sponsorship: The University of Pretoria's Faculty of Engineering, Built Environment, and Information Technology. [en_US]
dc.description.uri: http://section.iaesonline.com/index.php/IJEEI/index [en_US]
dc.identifier.citation: Nkongolo, M.N.W. & Tokmak, M. 2024, 'Ransomware detection using stacked autoencoder for feature selection', Indonesian Journal of Electrical Engineering and Informatics, vol. 12, no. 1, pp. 142-170, doi: 10.52549/ijeei.v12i1.5109. [en_US]
dc.identifier.issn: 2089-3272
dc.identifier.other: 10.52549/ijeei.v12i1.5109
dc.identifier.uri: http://hdl.handle.net/2263/98772
dc.language.iso: en [en_US]
dc.publisher: Institute of Advanced Engineering and Science [en_US]
dc.rights: © 2024 Institute of Advanced Engineering and Science. All rights reserved. This work is licensed under a Creative Commons Attribution 4.0 International License. [en_US]
dc.subject: Ransomware classification [en_US]
dc.subject: Ransomware profiling [en_US]
dc.subject: UGRansome dataset [en_US]
dc.subject: Cryptology [en_US]
dc.subject: Cyberintelligence [en_US]
dc.subject: Autoencoder weights [en_US]
dc.subject: Machine learning [en_US]
dc.subject: Ensemble learning [en_US]
dc.subject: Deep learning [en_US]
dc.subject: Supervised learning [en_US]
dc.subject: Feature selection [en_US]
dc.subject: Malware threats [en_US]
dc.subject: Signature attacks [en_US]
dc.subject: Intrusion detection [en_US]
dc.subject: XGBoost [en_US]
dc.subject: Long short-term memory (LSTM) [en_US]
dc.subject: Stacked autoencoder [en_US]
dc.subject: SDG-09: Industry, innovation and infrastructure [en_US]
dc.title: Ransomware detection using stacked autoencoder for feature selection [en_US]
dc.type: Article [en_US]

Files

Original bundle

Name: Nkongolo_Ransomware_2024.pdf
Size: 1.83 MB
Format: Adobe Portable Document Format
Description: Article

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission