Metric Selection For Root Cause Analysis of Cloud Infrastructure
Summary
Cryptojacking is an emerging threat to cloud infrastructure in which attackers
hijack computing resources for unauthorized cryptocurrency mining.
In containerized environments such as AWS Fargate, detection is hindered
by the dynamic nature of workloads and by subtle behavioural changes, and
existing methods either yield too many false positives or respond too
slowly. The challenge is to detect
cryptojacking incidents in contemporary cloud environments from extensive
system statistics alone. This thesis examines time-series anomaly detection
techniques (ARIMA and Holt-Winters) in combination with machine learning
models, and demonstrates that models leveraging carefully engineered
resource features significantly improve the detection of cost anomalies
caused by cryptojacking, especially in cases missed by traditional
time-series methods. This is further supported by a soft-voting ensemble of
the models, which achieves substantially higher precision and recall
than random and single-model baselines. The findings demonstrate
a real-world approach that can be applied widely for flexible and
scalable anomaly detection in the fast-evolving domain of cloud-native
systems.
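
For readers unfamiliar with the term, the soft-voting idea mentioned in the summary can be illustrated with a minimal sketch. It assumes each detector outputs an anomaly probability per observation; the function name soft_vote, the equal default weights, the 0.5 threshold, and the sample probabilities are all hypothetical choices made for illustration and are not taken from the thesis.

```python
def soft_vote(model_probs, weights=None, threshold=0.5):
    """Average per-model anomaly probabilities, then threshold the average.

    model_probs: one list of probabilities per model, all of equal length.
    weights: optional per-model weights (assumed equal if omitted).
    """
    if weights is None:
        weights = [1.0] * len(model_probs)
    total_weight = sum(weights)
    n_samples = len(model_probs[0])

    # Weighted mean of the probabilities for each observation.
    averaged = [
        sum(w * probs[i] for w, probs in zip(weights, model_probs)) / total_weight
        for i in range(n_samples)
    ]
    # An observation is flagged when the averaged probability reaches the threshold.
    flags = [p >= threshold for p in averaged]
    return averaged, flags


# Hypothetical probabilities from three detectors for three observations.
model_probs = [
    [0.10, 0.80, 0.40],
    [0.20, 0.90, 0.70],
    [0.05, 0.60, 0.55],
]
averaged, flags = soft_vote(model_probs)
print(flags)
```

In this sketch the third observation is flagged even though one detector scores it below the threshold, because the averaged probability still reaches it. Soft voting differs from majority (hard) voting in that it averages probabilities rather than counting each model's yes/no decision.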