- Explainable Artificial Intelligence (XAI) is essential for the acceptance of machine learning (ML) models, especially in critical domains like network security. Administrators need interpretable explanations to validate decisions, yet existing XAI methods often suffer from low consensus, where different techniques yield conflicting explanations. A key factor contributing to this issue is the presence of correlated features, which allows multiple equivalent but divergent explanations. While decorrelation techniques, such as Principal Component Analysis (PCA), can mitigate this, they often reduce interpretability by abstracting original features into complex combinations. This work investigates whether feature decorrelation via decomposition techniques can improve consensus among post-hoc XAI methods in the context of ML-based network intrusion detection (ML-NIDS). Using both NIDS and synthetic data, we analyze the effect of decorrelation across different models and preprocessing. We find that decorrelation can significantly improve consensus, but its effectiveness is highly dependent on the underlying model, preprocessing, and dataset characteristics. We also explore sparsity-inducing variants of PCA to partially recover interpretability, though results vary depending on the level of sparsity enforced.
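
The sketch below illustrates the general idea rather than the paper's exact pipeline: features are decorrelated with PCA (and a sparsity-inducing variant), and agreement between two post-hoc attribution methods is measured as a rough consensus proxy. The synthetic data, random forest model, importance methods, and Spearman-based consensus score are illustrative assumptions, not the authors' setup.

```python
# Minimal sketch (assumed setup, not the authors' exact pipeline): decorrelate
# features with PCA / SparsePCA, then compare two attribution methods via
# Spearman rank correlation as a rough "consensus" proxy.
import numpy as np
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, SparsePCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data with redundant (correlated) features, loosely mirroring the
# correlated-feature problem the abstract describes.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           n_redundant=10, random_state=0)

def consensus_score(X, y):
    """Train a model and compare two importance estimates on the same features."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    imp_a = model.feature_importances_                       # method A: impurity-based
    imp_b = permutation_importance(model, X_te, y_te,         # method B: permutation-based
                                   n_repeats=10, random_state=0).importances_mean
    rho, _ = spearmanr(imp_a, imp_b)                          # agreement between A and B
    return rho

print("consensus, original features:", consensus_score(X, y))
print("consensus, PCA-decorrelated :", consensus_score(PCA().fit_transform(X), y))
# Sparse PCA ties each component to fewer original features, trading some
# decorrelation for interpretability (alpha controls the enforced sparsity).
X_sparse = SparsePCA(n_components=10, alpha=1.0, random_state=0).fit_transform(X)
print("consensus, SparsePCA        :", consensus_score(X_sparse, y))
```

In practice, the paper's comparison would involve actual XAI techniques (e.g., different post-hoc explainers) and NIDS datasets; this snippet only shows the mechanical pattern of decorrelating before training and scoring explanation agreement afterward.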

