Abstract:
Network intrusion attacks, particularly previously unknown attacks referred to as zero-day attacks, have recently become a global phenomenon. Zero-day network intrusion attacks are a frequent cybersecurity threat because they exploit vulnerabilities in a network system. Previous studies have demonstrated that zero-day attacks can compromise a network for prolonged periods if network traffic analysis (NTA) is not performed thoroughly and efficiently. NTA plays a crucial role in supporting machine learning (ML) based network intrusion detection systems (NIDS) by monitoring network traffic data and extracting meaningful information from it. Network traffic data are voluminous and are described by features such as the destination-to-source packet count. It is important to use only those features that have a significant impact on the performance of an NIDS. However, most existing ML models for NIDS employ features, such as Internet Protocol (IP) addresses, that are redundant for detecting zero-day attacks and therefore degrade the performance of these models. This study demonstrates that the law of anomalous numbers, widely known as Benford’s law, is a viable technique for identifying significant network features that are indicative of anomalous behaviour and can be used for detecting zero-day attacks. Our study further illustrates that semi-supervised ML approaches are effective for detecting zero-day attacks if significant features are optimally chosen. The experimental results demonstrate that one-class support vector machines achieved the best results (Matthews correlation coefficient of 74% and F1 score of 85%) for detecting zero-day network attacks.
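To make the role of Benford's law concrete, the following is a minimal sketch of a first-digit conformity check for a single numeric traffic feature. Benford's law states that the leading digit d (1 to 9) occurs with probability log10(1 + 1/d). The abstract does not specify how the study measures deviation from this distribution, so the mean absolute deviation used here is an assumption, and the function names and the two sample value lists are hypothetical illustrations rather than data from the study.

```python
import math
from collections import Counter

# Benford's law: expected probability of each leading digit d in 1..9.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """Return the first significant digit of x, or None if x is zero."""
    x = abs(x)
    if x == 0:
        return None
    # Scientific notation places the first significant digit at index 0.
    return int(f"{x:.15e}"[0])


def first_digit_distribution(values):
    """Observed relative frequency of each leading digit 1..9."""
    digits = [d for d in map(leading_digit, values) if d is not None]
    if not digits:
        raise ValueError("no nonzero values to analyse")
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


def benford_mad(values):
    """Mean absolute deviation from Benford's law (assumed metric, not the paper's)."""
    observed = first_digit_distribution(values)
    return sum(abs(observed[d] - BENFORD[d]) for d in range(1, 10)) / 9


# Hypothetical feature values, for illustration only.
benford_like = [2 ** k for k in range(1, 200)]   # spans many orders of magnitude
uniform_like = list(range(100, 1000))            # every leading digit equally common

mad_benford_like = benford_mad(benford_like)
mad_uniform_like = benford_mad(uniform_like)
```

In this sketch a feature whose values conform to Benford's law yields a small deviation, while a feature with evenly spread leading digits yields a larger one. How such a score would be thresholded or used to rank features for an NIDS is left open here, since the abstract does not state it.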