Combining Teaching and Following in Repeated Games
Summary
In recent years, there has been a clear shift from single-agent research to multiagent systems. Multiagent systems, in the broad sense, are systems that include multiple autonomous entities with diverging information, diverging interests, or both. The capacity to learn is a key facet of intelligent behaviour, and it is no surprise that much attention has been devoted to the subject in the various disciplines that study intelligence and rationality. The study of learning in such systems is the area of multiagent learning, which draws primarily on two disciplines: artificial intelligence and game theory. Here game theory can be seen as a tool to model the interactions that can arise between different agents: a game describes a scenario in which agents can perform actions and receive a payoff determined by a quantitative payoff function. In order to reach a solution of a game, the agents need to coordinate their respective actions. This coordination problem is one of the major topics within multiagent learning.
In this thesis, we consider the setting in which agents repeatedly play the same stage game, in other words a repeated game. Moreover, we assume that the agents are not pre-coordinated and have no explicit means of communication; they can only observe each other's behaviour. From this perspective, the act of proposing (or forcing) an outcome to our opponents makes sense, which we will informally describe as ‘teaching’ behaviour. On the other hand we have ‘following’ behaviour, which can be understood as the act of going along with such a proposal. Teaching behaviour not only helps to reach coordination; adopting the role of a teacher often also allows us to ‘steer’ followers to outcomes that are more beneficial to us. However, it can lead to miscoordination if multiple agents try to teach different outcomes of the game. Without an external designation of these roles, it can be hard to decide whether to take on the role of a teacher or a follower.
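The tension between the two roles can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the formal definitions developed in this thesis: the 2x2 coordination game and its payoffs are hypothetical, a ‘teacher’ is modelled as a strategy that always insists on one action, and a ‘follower’ as a strategy that copies the opponent's previous action (which is a best response to that action in this particular game).

```python
# Hypothetical 2x2 coordination game (illustration only).
# PAYOFF[(row_action, col_action)] = (row_payoff, col_payoff)
PAYOFF = {
    ("A", "A"): (2, 1),  # outcome preferred by the row player
    ("A", "B"): (0, 0),  # miscoordination
    ("B", "A"): (0, 0),  # miscoordination
    ("B", "B"): (1, 2),  # outcome preferred by the column player
}


def teacher(preferred):
    """Assumed 'teaching' strategy: insist on one action, ignoring history."""
    def act(opponent_last):
        return preferred
    return act


def follower(initial):
    """Assumed 'following' strategy: copy the opponent's previous action."""
    def act(opponent_last):
        return initial if opponent_last is None else opponent_last
    return act


def play(row, col, rounds=10):
    """Play the repeated game and return the average payoff of each player."""
    row_last = col_last = None
    total_row = total_col = 0
    for _ in range(rounds):
        # Each agent only observes the other's previous action.
        a_row, a_col = row(col_last), col(row_last)
        p_row, p_col = PAYOFF[(a_row, a_col)]
        total_row += p_row
        total_col += p_col
        row_last, col_last = a_row, a_col
    return total_row / rounds, total_col / rounds


if __name__ == "__main__":
    print("teacher vs follower:", play(teacher("A"), follower("B")))
    print("teacher vs teacher: ", play(teacher("A"), teacher("B")))
```

In this sketch a teacher facing a follower loses only the first round and then steers play to its own preferred outcome, whereas two teachers insisting on different outcomes never coordinate and both earn nothing.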
In this thesis, we will try to formally define when a strategy can be said to both teach and follow. To achieve this, we will first take a closer look at the individual behaviours of teaching and following, and then formulate a criterion that captures and combines these opposing behaviours into a unified whole. A criterion is a formal requirement that an algorithm should satisfy in order to achieve certain (beneficial) properties, and many such criteria have been formulated in recent years. The benefit of proposing such a criterion is that it is general enough to analyse and discuss, yet specific enough to allow algorithmic implementation.