Abstract:
In response to escalating malware threats, we propose an advanced
ransomware detection and classification method. Our approach combines a
stacked autoencoder for precise feature selection with a Long Short-Term
Memory classifier, which significantly enhances ransomware stratification
accuracy. The process involves preprocessing the UGRansome dataset,
training a stacked autoencoder without supervision for feature selection,
and fine-tuning via supervised learning to improve the Long Short-Term
Memory model's classification performance. We analysed the autoencoder's
learned weights and activations to identify the features essential for
distinguishing 17 ransomware families from other malware and created a
streamlined feature set for precise classification. Our results show that the
stacked autoencoder-based Long Short-Term Memory model achieves high
precision, recall, and F1 scores across the 17 ransomware families, and its
balanced average scores indicate that it generalises well across malware
types. To
optimise the proposed model, we ran extensive experiments with up to 400
training epochs and varying learning rates, achieving 98.5% accuracy in
ransomware classification, a result that surpasses traditional machine
learning classifiers. The proposed model also outperforms the Extreme
Gradient Boosting (XGBoost) algorithm in identifying signature attacks,
reaching 98.5% accuracy against 95.5% for XGBoost, primarily owing to its
effective stacked autoencoder feature selection mechanism. In addition, a
prediction of ransomware's financial impact using the proposed model
suggests that while Locky, SamSam, and WannaCry still incur substantial
cumulative costs, their attacks may not be as financially damaging as those
of NoobCrypt, DMALocker, and EDA2.
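
The precision, recall, and F1 values mentioned above are per-family metrics, and a balanced average can be obtained by averaging them with equal weight per family. The sketch below illustrates that computation in plain Python. It assumes that "balanced average" means a macro average, which the abstract does not state explicitly. The function names and the six toy labels are hypothetical and are not taken from the UGRansome experiments.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted family p, but it was wrong
            fn[t] += 1  # true family t was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of (precision, recall, f1) over all classes."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels for illustration only (assumed, not experimental data).
y_true = ["Locky", "Locky", "SamSam", "SamSam", "WannaCry", "WannaCry"]
y_pred = ["Locky", "SamSam", "SamSam", "SamSam", "WannaCry", "Locky"]

scores = per_class_scores(y_true, y_pred)
for family, (p, r, f) in scores.items():
    print(f"{family}: precision={p:.3f} recall={r:.3f} f1={f:.3f}")
print("macro average:", macro_average(scores))
print("accuracy:", accuracy(y_true, y_pred))
```

A macro average gives rare families the same weight as common ones, so it can differ noticeably from overall accuracy when the class distribution is skewed.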