Proactive autoscaling for performance variable cloud infrastructure
Summary
Cloud computing has transformed the IT landscape by offering computing resources as a utility.
Using the cloud, resources such as virtual machines (VMs) can be rented on-demand by users.
Autoscalers exist to automatically determine the number of virtual machines a user requires for a given application workload. On cloud platforms, autoscalers automatically adjust the capacity of the cloud resources in use to maintain steady and predictable performance. One limitation of current autoscalers is that they are often proprietary and incapable of multi-cloud operation, causing vendor lock-in. Additionally, cloud offerings can suffer from performance variability, which current autoscalers cannot resolve cost-effectively. Moreover, they are very limited in their ability to scale within heterogeneous environments consisting of virtual machines built on differing hardware.
To optimise an autoscaler's behaviour, state-of-the-art approaches use machine learning (ML) techniques to predict future workload requirements and to support informed decision-making. Existing ML-based performance variability detection techniques typically focus on analysing application-specific Quality of Service (QoS) objectives, and no previous research has attempted to integrate performance variability detection into autoscalers or to analyse its benefits. Furthermore, state-of-the-art autoscalers are often limited in their ability to scale in multi-cloud environments due to their proprietary nature. When performance degrades, they scale up the number of resources in use without evaluating the performance, or the performance variability, of individual resources. They are also unaware of pricing and performance differences across clouds: where multi-cloud support exists at all, they can usually only scale within a single cloud provider, leaving resources at other providers constant.
The core issue of missing performance variability detection in autoscalers lies in the fact that most state-of-the-art autoscalers focus primarily on scaling the quantity of resources based on QoS objectives. While they are adept at this task, they overlook the possibility that temporary QoS violations are caused by the performance variability of individual cloud resources. Autoscalers tend to scale up resources rather than replace underperforming worker nodes, incurring additional cost for cloud users while not addressing the root cause of the performance issue. Another limitation of state-of-the-art autoscalers is their inability to scale in multi-cloud environments, especially across a heterogeneous pool of machines. Moreover, cloud users are at an information disadvantage compared to their providers. To support both autoscalers and cloud users, a comprehensive metric for comparing cloud virtual machines of different types is required.
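The replace-rather-than-scale-up idea can be illustrated with a minimal sketch. All names here (`VMNode`, `price_performance`, `pick_replacement_candidate`) and the median-based threshold are hypothetical simplifications, not the thesis's actual metrics: each node gets a naive price-performance score (throughput per dollar per hour), and a node is flagged for replacement only when its score falls far below the fleet median, instead of always adding capacity.

```python
from dataclasses import dataclass

@dataclass
class VMNode:
    name: str
    hourly_price: float   # rental cost in USD per hour
    throughput: float     # sustained requests/s observed on this node

def price_performance(node: VMNode) -> float:
    """Throughput obtained per dollar per hour; higher is better."""
    return node.throughput / node.hourly_price

def pick_replacement_candidate(nodes: list[VMNode], threshold: float = 0.5):
    """Return the node whose price-performance falls more than
    `threshold` (as a fraction) below the fleet median, or None
    if no node underperforms that badly."""
    scores = sorted(price_performance(n) for n in nodes)
    median = scores[len(scores) // 2]
    worst = min(nodes, key=price_performance)
    if price_performance(worst) < (1 - threshold) * median:
        return worst   # replace this node instead of scaling up
    return None
```

In this toy setting, a fleet of identically priced VMs where one node sustains 30 req/s against a 95 req/s median would yield that node as the replacement candidate, whereas a healthy fleet yields `None`; a real autoscaler would base the comparison on richer metrics and a detection model rather than a single throughput sample.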
To mitigate these challenges, this thesis first presents a literature review evaluating state-of-the-art autoscaling and performance variability detection approaches and their current drawbacks, complemented by research into benchmarking approaches for evaluating the performance of different types of virtual machines.
This research considered different performance evaluation and performance variability detection techniques suitable for inclusion in an autoscaler. Furthermore, a performance variability detection approach based on long short-term memory (LSTM) networks was devised to detect performance variability from low-level system metrics, providing application-agnostic versatility. Additionally, a performance evaluation approach was devised based on two novel metrics, namely the performance evaluator's performance score (PS) and the price-performance