Optimizing configurations to get the best of the cloud
Summary
Hosting a robust and high-performance Software as a Service (SaaS) solution requires resources and efficient use of those resources. Both depend on the infrastructure on which the application runs. If the usage of a SaaS solution grows gradually over time, the infrastructure can be adjusted on the fly. However, if it is a new product that is known to attract many users from the start, the infrastructure needs to be prepared for that load in advance. The effectiveness of an infrastructure configuration can be measured against multiple objectives, such as cost, performance and robustness.
In this research we provide insight into how the configuration of the infrastructure influences these objectives, and we present the decision maker with the best options so that a well-informed trade-off can be made. This is achieved with the use of a Pareto front.
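To make the notion of a Pareto front concrete, the following minimal sketch keeps only the configurations that no other configuration beats on every objective. The (cost, response time) values are hypothetical, and the assumption that every objective is minimised is made for illustration only; neither is taken from the measurements in this research.

```python
from typing import List, Tuple

Objectives = Tuple[float, ...]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if a is at least as good as b everywhere and strictly better once.

    Assumption: every objective is to be minimised.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (cost, response time) pairs, one per configuration.
candidates = [
    (10.0, 300.0),
    (20.0, 150.0),
    (15.0, 320.0),  # worse than (10.0, 300.0) on both objectives
    (40.0, 140.0),
    (25.0, 150.0),  # same response time as (20.0, 150.0) but more expensive
]

front = pareto_front(candidates)
print(front)
```

The remaining points are the options between which a decision maker would have to trade cost against response time.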
We defined a configuration for the hardware and the orchestrator that consists of 13 parameters, selected with a focus on the objectives. To create the Pareto front we use the non-dominated sorting genetic algorithm, which requires a fast evaluation of the configurations. We therefore use heuristics to speed up the evaluation. Training these heuristics requires training data; to make sure the training data covers the possibilities evenly and with few samples, we use a sampling strategy. A nearly orthogonal Latin hypercube sampling design fulfils these properties, and following this design we obtained 65 sample points.
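The sampling idea can be sketched with a basic random Latin hypercube: each parameter range is divided into as many strata as there are samples, and every stratum is used exactly once per parameter. This is a simplification. It does not enforce the near-orthogonality between parameters that the design used in this research provides, and the function name, the seed and the unit-interval scaling are assumptions made for illustration.

```python
import random
from typing import List


def latin_hypercube(n_samples: int, n_dims: int, seed: int = 0) -> List[List[float]]:
    """Draw a basic random Latin hypercube sample in the unit hypercube.

    Note: this is NOT a nearly orthogonal design; columns are permuted
    independently, so correlations between parameters are not controlled.
    """
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        # One stratum index per sample, in random order.
        strata = list(range(n_samples))
        rng.shuffle(strata)
        # Place each point at a random position inside its stratum.
        columns.append([(s + rng.random()) / n_samples for s in strata])
    # Transpose so that each row is one sample point.
    return [[columns[d][i] for d in range(n_dims)] for i in range(n_samples)]


# 65 sample points over 13 parameters, matching the sizes named in the text.
design = latin_hypercube(n_samples=65, n_dims=13)
print(len(design), len(design[0]))
```

Each value in the unit interval would still have to be mapped to the actual range of the corresponding configuration parameter before it can be used in a load test.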
Each sample point is evaluated with a load test to measure the output variables. The heuristics were trained on this training data set, and five additional samples were taken and used as a validation set. The heuristic that approximates reality most accurately is used in the non-dominated sorting genetic algorithm. This algorithm produces the Pareto front, which can then be analysed to make a trade-off.
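Selecting the most accurate heuristic on a validation set can be sketched as follows. The two candidate heuristics, the five validation pairs and the choice of root mean squared error as the accuracy measure are all hypothetical and serve only to show the selection step; they are not the models, data or metric used in this research.

```python
import math
from typing import Callable, Dict, List, Tuple

Sample = Tuple[float, float]  # (input value, measured output)


def rmse(model: Callable[[float], float], samples: List[Sample]) -> float:
    """Root mean squared error of a model's predictions on the given samples."""
    squared_errors = [(model(x) - y) ** 2 for x, y in samples]
    return math.sqrt(sum(squared_errors) / len(samples))


def select_best(models: Dict[str, Callable[[float], float]],
                validation: List[Sample]) -> str:
    """Return the name of the model with the lowest validation error."""
    return min(models, key=lambda name: rmse(models[name], validation))


# Hypothetical trained heuristics, each mapping one input to one output.
models = {
    "constant": lambda x: 200.0,
    "linear": lambda x: 400.0 - 50.0 * x,
}

# Hypothetical validation set of five additional samples.
validation = [(1.0, 345.0), (2.0, 305.0), (3.0, 240.0), (4.0, 210.0), (5.0, 150.0)]

best = select_best(models, validation)
print(best, rmse(models[best], validation))
```

With several output variables, the same comparison could be made per output or on a combined error, depending on which objective matters most to the decision maker.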