Abstract:
In most applications, anomaly detection operates in an unsupervised mode, looking for outliers in the hope that they correspond to anomalies. Unfortunately, most anomaly detectors do not provide explanations of which features make a detected outlier anomalous. Human analysts must therefore manually browse through each detected outlier's feature space to identify the subset of features that helps them determine whether the point is genuinely anomalous. This paper introduces sequential explanation (SE) methods that sequentially explain to the analyst which features make the detected outlier anomalous. We present two classes of methods for computing SEs, the outlier-based and sample-based SEs, which work alongside any anomaly detector. The outlier-based SE methods use an anomaly detector's outlier scoring measure, guided by a search algorithm, to compute the SEs. The sample-based SE methods instead employ sampling to turn the problem into a classical feature selection problem. In our experiments, we compare the performance of the different outlier-based and sample-based SEs. Our results show that both the outlier-based and sample-based methods compute SEs that perform well and outperform sequential feature explanations.
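To make the outlier-based idea concrete, the following is a minimal sketch, not the paper's exact algorithm: it greedily orders features so that a detector's outlier score, evaluated on the growing feature subset, increases as quickly as possible. The use of an Isolation Forest, the refit-per-subset scoring, and the `greedy_sequential_explanation` helper are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's exact method) of an outlier-based SE:
# greedily pick the feature whose addition most raises the detector's outlier
# score for the point being explained.

import numpy as np
from sklearn.ensemble import IsolationForest


def greedy_sequential_explanation(X, x_outlier, n_features=3):
    """Return feature indices ordered by how much each added feature
    increases the outlier score of `x_outlier` (hypothetical helper)."""
    d = X.shape[1]
    chosen, remaining = [], list(range(d))

    while remaining and len(chosen) < n_features:
        best_feat, best_score = None, -np.inf
        for f in remaining:
            subset = chosen + [f]
            # Refit the detector on the candidate feature subset and score the
            # outlier; negate sklearn's score so higher means more anomalous.
            det = IsolationForest(random_state=0).fit(X[:, subset])
            score = -det.score_samples(x_outlier[subset].reshape(1, -1))[0]
            if score > best_score:
                best_feat, best_score = f, score
        chosen.append(best_feat)
        remaining.remove(best_feat)
    return chosen
```

In this sketch, the returned feature ordering plays the role of an SE: the analyst inspects the features in that order until they can decide whether the outlier is genuinely anomalous.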