Actor-Critic Catan: Reinforcement Learning in High-Strategic Environments
Summary
Settlers of Catan is a complex board game that makes a good research environment for Reinforcement Learning (RL) because of its highly strategic nature and its mixture of competition and cooperation. In this research, Settlers of Catan was digitized using the Unity game engine and Python. Building on the versatility of Actor-Critic methods, Advantage Actor-Critic (A2C) was applied to a simplified version of Catan, with the goal of playing at a human level and beating Random-Agents along the way. The A2C agent was found to learn to spend its resources and to avoid no-ops (passing), as shown by decreasing loss values and by inspection of individual episodes. However, because of several biases, it does not perform well enough even against the Random-Agents, which simply make random legal moves. This is mostly because the agent does not learn to prioritize specific locations for buildings and because of the difficulties caused by the sparse rewards in this game. Promoting curiosity and decentralizing the Actors could be used in future work to improve the results of an RL-powered agent in Settlers of Catan.
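To make the A2C terminology concrete, the following is a minimal sketch of how the actor and critic losses are commonly computed for one episode. It is not the implementation used in this research: the function names, the discount factor, the absence of an entropy term, and the single end-of-game reward are all assumptions chosen for illustration.

```python
import math


def discounted_returns(rewards, gamma=0.99, bootstrap_value=0.0):
    """Compute R_t = r_t + gamma * R_{t+1}, working backwards through the episode."""
    returns = []
    running = bootstrap_value  # critic's estimate after the last step (0 if terminal)
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def a2c_losses(log_probs, values, rewards, gamma=0.99, bootstrap_value=0.0):
    """Return (actor_loss, critic_loss, advantages) for one trajectory.

    log_probs: log pi(a_t | s_t) of the actions actually taken
    values:    critic estimates V(s_t)
    rewards:   rewards r_t received after each action
    """
    returns = discounted_returns(rewards, gamma, bootstrap_value)
    # Advantage: how much better the outcome was than the critic expected.
    advantages = [ret - val for ret, val in zip(returns, values)]
    n = len(rewards)
    # Actor: raise the log-probability of actions with positive advantage.
    # In a real training loop the advantage is treated as a constant here.
    actor_loss = -sum(lp * adv for lp, adv in zip(log_probs, advantages)) / n
    # Critic: mean squared error between predicted value and observed return.
    critic_loss = sum(adv ** 2 for adv in advantages) / n
    return actor_loss, critic_loss, advantages


# Hypothetical sparse-reward episode: a reward only arrives on the final step.
rewards = [0.0, 0.0, 1.0]
values = [0.0, 0.0, 0.0]        # an untrained critic predicting zero everywhere
log_probs = [math.log(0.5)] * 3  # each action was chosen with probability 0.5
actor_loss, critic_loss, advantages = a2c_losses(log_probs, values, rewards, gamma=0.5)
```

With a single reward at the end of the episode, the early steps only receive a discounted share of it, which illustrates why sparse rewards make it hard to credit individual decisions such as where to build.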