Applying Opponent and Environment Modelling in Decentralised Multi-Agent Reinforcement Learning

Authors

Panov A. I., Skrynnik A. A.

Abstract

Multi-agent reinforcement learning (MARL) has recently gained popularity and achieved considerable success in different kinds of games, including zero-sum, cooperative, and general-sum games. Nevertheless, the vast majority of modern algorithms assume information sharing during training and hence cannot be used in decentralised applications, scale to high-dimensional scenarios, or be applied to settings with general or sophisticated reward structures. Moreover, because data in real-world applications are sparse and expensive to collect, it becomes necessary to use world models that capture the environment dynamics with latent variables, i.e. to use a world model to generate synthetic data for training MARL algorithms. Focusing on the paradigm of decentralised training and decentralised execution, we therefore propose an extension of model-based reinforcement learning approaches that leverages fully decentralised training with planning conditioned on neighbouring co-players' latent representations. Our approach is inspired by the idea of opponent modelling. The method lets the agent learn in a joint latent space without needing to interact with the environment. We present the approach as a proof of concept that decentralised model-based algorithms can give rise to collective behaviour with limited communication during planning, and demonstrate its necessity on iterated matrix games and modified versions of the StarCraft Multi-Agent Challenge (SMAC).

External links

DOI: 10.2139/ssrn.4959804

Download the preprint (PDF) or read it online at SSRN (in English): https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4959804

ResearchGate: https://www.researchgate.net/publication/384123943_Applying_Opponent_and_Environment_Modelling_in_Decentralised_Multi-Agent_Reinforcement_Learning

Watch Alexander Chernyavskiy's presentation on the channel of the MIPT Center for Cognitive Modelling (from 1:15:35):

How to cite

Chernyavskiy, Alexander, Panov, Aleksandr, and Skrynnik, Aleksey (2024). Applying Opponent and Environment Modelling in Decentralised Multi-Agent Reinforcement Learning. Available at SSRN.