Applying Opponent and Environment Modelling in Decentralised Multi-Agent Reinforcement Learning

Authors

Panov A., Skrynnik A.

Abstract

Multi-agent reinforcement learning (MARL) has recently gained popularity and achieved considerable success in different kinds of games, including zero-sum, cooperative, and general-sum games. Nevertheless, the vast majority of modern algorithms assume information sharing during training and therefore cannot be used in decentralised applications, nor do they scale to high-dimensional scenarios or handle general or sophisticated reward structures. Moreover, because data in real-world applications is expensive to collect and sparse, it becomes necessary to use world models that capture the environment dynamics through latent variables, i.e. to use a world model to generate synthetic data for training MARL algorithms. Focusing on the paradigm of decentralised training and decentralised execution, we therefore propose an extension to model-based reinforcement learning approaches that combines fully decentralised training with planning conditioned on the latent representations of neighbouring co-players. Our approach is inspired by the idea of opponent modelling. The method lets each agent learn in a joint latent space without needing to interact with the environment. We present the approach as a proof of concept that decentralised model-based algorithms can give rise to collective behaviour with limited communication during planning, and demonstrate its necessity on iterated matrix games and modified versions of the StarCraft Multi-Agent Challenge (SMAC).
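To make the setting described in the abstract more concrete, the sketch below shows one way an agent could encode its own observation into a latent state, condition its world model and policy on the latent communicated by a neighbouring co-player, and train by imagining rollouts in the joint latent space instead of interacting with the environment. This is a minimal illustration, not the authors' implementation: the class names, network sizes, and the single fixed neighbour latent are assumptions made for brevity; in the actual method, neighbours' latents would be exchanged and updated during planning.

```python
import torch
import torch.nn as nn

# Illustrative sketch only: module names, dimensions, and the
# single-neighbour setting are assumptions, not the paper's implementation.

class LatentWorldModel(nn.Module):
    """Encodes observations and predicts the next latent state and reward
    from the joint (own + neighbour) latent and the chosen action."""

    def __init__(self, obs_dim=16, own_dim=32, neighbour_dim=32, action_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, own_dim)
        )
        self.dynamics = nn.Sequential(
            nn.Linear(own_dim + neighbour_dim + action_dim, 64),
            nn.ReLU(),
            nn.Linear(64, own_dim),
        )
        self.reward_head = nn.Linear(own_dim, 1)

    def encode(self, obs):
        return self.encoder(obs)

    def imagine_step(self, own_latent, neighbour_latent, action):
        joint = torch.cat([own_latent, neighbour_latent, action], dim=-1)
        next_latent = self.dynamics(joint)
        return next_latent, self.reward_head(next_latent)


class LatentPolicy(nn.Module):
    """Selects an action conditioned on the agent's own latent and the
    latent communicated by a neighbouring co-player."""

    def __init__(self, own_dim=32, neighbour_dim=32, n_actions=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(own_dim + neighbour_dim, 64), nn.ReLU(), nn.Linear(64, n_actions)
        )

    def forward(self, own_latent, neighbour_latent):
        logits = self.net(torch.cat([own_latent, neighbour_latent], dim=-1))
        return torch.distributions.Categorical(logits=logits)


# "Learning in imagination": roll the learned dynamics forward from an
# encoded observation without querying the real environment.
model, policy = LatentWorldModel(), LatentPolicy()
obs = torch.randn(1, 16)               # the agent's own observation
neighbour_latent = torch.randn(1, 32)  # latent received from a neighbour (kept fixed here)
latent = model.encode(obs)
imagined_return = torch.zeros(1, 1)
for _ in range(5):                     # short imagined rollout
    action_dist = policy(latent, neighbour_latent)
    action = nn.functional.one_hot(action_dist.sample(), num_classes=8).float()
    latent, reward = model.imagine_step(latent, neighbour_latent, action)
    imagined_return = imagined_return + reward
print("imagined return:", imagined_return.item())
```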

External links

DOI: 10.2139/ssrn.4959804

Download preprint (PDF) or read online at SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4959804

ResearchGate: https://www.researchgate.net/publication/384123943_Applying_Opponent_and_Environment_Modelling_in_Decentralised_Multi-Agent_Reinforcement_Learning

Watch Alexander Chernyavskiy's presentation at the Center for Cognitive Modeling (from 1:15:35):

Reference link

Chernyavskiy, Alexander, Panov, Aleksandr, and Skrynnik, Aleksey (2024). Applying Opponent and Environment Modelling in Decentralised Multi-Agent Reinforcement Learning. Available at SSRN.