dc.rights.license | CC-BY-NC-ND | |
dc.contributor.advisor | Hauptmann, Hanna | |
dc.contributor.author | Ben, Sinie van der | |
dc.date.accessioned | 2024-02-15T14:57:01Z | |
dc.date.available | 2024-02-15T14:57:01Z | |
dc.date.issued | 2024 | |
dc.identifier.uri | https://studenttheses.uu.nl/handle/20.500.12932/45996 | |
dc.description.abstract | In the current digital society, network sizes continue to expand to meet the demand for digitalisation. This expansion is not without security risks. Network Anomaly Detection (NAD) is concerned with the detection of malicious traffic in a network. Within the field of NAD, applications that use machine learning to protect the network are on the rise. This rise is accompanied by a demand for explanations of the models' decisions. There are studies that apply Explainable Artificial Intelligence (XAI) in NAD, but they often lack proper evaluation of the proposed explanations. This thesis addresses this gap by investigating objective properties for explanations in the field of NAD.
Three models are trained on the UNSW-NB15 dataset: Random Forest (RF), Explainable Boosting Machine (EBM) and Autoencoder (AE). Different versions of the dataset are created, and hyperparameter tuning is performed to find the optimal version of a binary and a multiclass classifier. Several models are selected and combined with explanations. LIME and SHAP are chosen as model-agnostic explanation methods, while the EBM is inherently explainable. Predictions of the AE are explained by a custom method, as there is no universal way of explaining this model yet. The explanations are evaluated on two objective properties, namely sensitivity and fidelity.
The results show that the RF outperforms the other two models in binary classification when considering model performance alone. The binary EBM shows the highest fidelity metrics. For the multiclass classification problem, the RFs trained on a balanced dataset show the best performance, although the values of the objective properties are not optimal.
The final combination of a model with an explanation depends on the importance placed on model performance and on each objective explanation property. Aspects such as the format of the dataset or the model hyperparameters influence the model performance and can affect the explanation quality. Explanation quality appears to depend on the dataset, confirming earlier research.
Future research should incorporate different models, explanations and objective properties to extend this work and generate more insights into the objective properties of explanations in the field of NAD. | |
dc.description.sponsorship | Utrecht University | |
dc.language.iso | EN | |
dc.subject | In the field of Network Anomaly Detection (NAD), applications of deep learning have risen. As in other domains, there have been attempts at explainable AI (XAI) for NAD. Most of these attempts lack objective and subjective evaluation. This thesis contributes to the objective evaluation of XAI for NAD by exploring the fidelity and sensitivity properties of explanations. | |
dc.title | The Quantitative Analysis of Explainable AI for Network Anomaly Detection | |
dc.type.content | Master Thesis | |
dc.rights.accessrights | Open Access | |
dc.subject.keywords | XAI, NAD, objective evaluation, objectives | |
dc.subject.courseuu | Artificial Intelligence | |
dc.thesis.id | 21581 | |