Automated playtesting on 2D video games. An agent-based approach on NethackClone via Iv4XR Framework
In this project we present a study on automated video game testing. We apply automated agent-based testing to NethackClone, a 2D, grid-based video game. Our implementation uses the Iv4xr framework, a tool designed to generalize automated testing across multiple types of video games, which lets us test the game by creating agents and assigning goals to them. Alongside the testing tasks, we perform a number of checks on the system under test (SUT), verifying that the game behaves as intended when specific actions take place in it. These checks concern the interaction between the player and the main elements of the game. We created two tests, with 7 goals and more than 25 actions, tactics and utilities, and ran our experiments on a total of 307 unique test cases (171 for Test 1, 136 for Test 2). We evaluated our approach on 3 main factors: coverage, success ratio and time; the time and effort needed to adapt the framework to each new game is also of interest to us. The experimental results showed not only that our approach performs efficiently to a considerable degree, but also that our system was able to detect an actual, previously unknown bug in the game. The framework's ability to adjust and generalize to multiple games is also promising, considering factors such as updates and adjustments to a game, or similarities between video games. The effort and time we devoted to the framework proved to be a one-time investment: once the integration of the SUT into the framework is complete, it can be used repeatedly to create new testing tasks that check different assets of the game. In this way it can help testers save significant time and effort in future tests on the same SUT.
However, our study also revealed shortcomings in our approach. Because our research was limited in time and computational power, future studies are needed that extend it to a much larger number of tests and test cases.