Social curiosity in deep multi-agent reinforcement learning
Summary
In Multi-Agent Reinforcement Learning (MARL), social dilemma environments make cooperation hard to learn. This is even harder for decentralized models, where agents do not share model components. Intrinsic rewards have been only partially explored as a solution to this problem, and training still requires a large number of samples and thus a long time. In an attempt to speed up this process, we propose a combination of the two main categories of intrinsic rewards: curiosity and empowerment. We run experiments in the cleanup and harvest social dilemma environments for several types of models, both with and without intrinsic motivation. We find no conclusive evidence that intrinsic motivation significantly alters experiment outcomes when using the PPO algorithm. We also find that PPO is unable to succeed in the harvest environment. However, we show both of these findings only in the absence of hyperparameter tuning.
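To make the proposed combination concrete, the sketch below shows one common way such a reward shaping could be wired up per agent before the PPO update: a forward-model prediction error as the curiosity term and a variational lower bound as the empowerment term, added to the extrinsic reward with separate coefficients. This is a minimal, hedged illustration; the module names, network sizes, and coefficients (`eta_c`, `eta_e`) are assumptions for exposition, not the exact formulation used in this work.

```python
# Minimal sketch of combining curiosity and empowerment intrinsic rewards
# with the extrinsic reward for a single agent. All names and coefficients
# here are illustrative assumptions, not the paper's exact method.
import torch
import torch.nn as nn


class ForwardModel(nn.Module):
    """Predicts the next state embedding from the current embedding and action."""

    def __init__(self, emb_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(emb_dim + n_actions, 64), nn.ReLU(),
            nn.Linear(64, emb_dim),
        )

    def forward(self, emb: torch.Tensor, action_onehot: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([emb, action_onehot], dim=-1))


def curiosity_bonus(forward_model, emb, action_onehot, next_emb):
    """Curiosity as forward-prediction error (ICM-style 'surprise')."""
    pred = forward_model(emb, action_onehot)
    return 0.5 * (pred - next_emb).pow(2).sum(dim=-1)


def empowerment_bonus(source_logits, planner_logits, action):
    """Variational lower bound on empowerment:
    log q(a | s, s') - log pi_source(a | s), per sampled action."""
    log_q = torch.log_softmax(planner_logits, dim=-1).gather(-1, action.unsqueeze(-1)).squeeze(-1)
    log_p = torch.log_softmax(source_logits, dim=-1).gather(-1, action.unsqueeze(-1)).squeeze(-1)
    return log_q - log_p


def shaped_reward(extrinsic, curiosity, empowerment, eta_c=0.1, eta_e=0.1):
    """Combine the extrinsic reward with both intrinsic terms before the PPO update."""
    return extrinsic + eta_c * curiosity + eta_e * empowerment
```

Because the agents are decentralized, each would hold its own copies of these modules and compute its shaped reward from its own observations only; the coefficients would be hyperparameters shared across agents.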