Cooperative Multi-Agent Reinforcement Learning (MARL) focuses on developing strategies for effectively training multiple agents to learn and adapt policies collaboratively. Although it is a relatively young area of study, most MARL methods build on well-established approaches from single-agent deep reinforcement learning because of their proven effectiveness. In this paper, we focus on the exploration problem inherent to many MARL algorithms. These algorithms frequently introduce new hyperparameters and auxiliary components, such as additional models, which complicates adapting the underlying RL algorithm to multi-agent settings. We aim to improve a deep MARL algorithm, the well-known QMIX, with minimal modifications. Our investigation of the exploration-exploitation dilemma shows that the performance of state-of-the-art MARL algorithms can be matched simply by adjusting the $\epsilon$-greedy policy, with the adjustment depending on the ratio of available joint actions to the number of agents. In addition, we modify the replay buffer so that training samples are recurrent rollouts rather than whole episodes, which decorrelates the experiences used for learning. The resulting algorithm is easy to implement and matches state-of-the-art methods without adding significant complexity.
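For illustration, below is a minimal Python sketch of an $\epsilon$-greedy schedule whose exploration level is tied to the ratio of the joint action space to the number of agents. The exact formula from the paper is not reproduced here; the function `joint_action_ratio`, the logarithmic scaling of the final $\epsilon$, and the hyperparameter names are assumptions made purely for this example.

```python
import numpy as np


def joint_action_ratio(n_actions: int, n_agents: int) -> float:
    # Hypothetical scaling quantity: size of the joint action space
    # (n_actions ** n_agents) relative to the number of agents.
    # The dependence actually used in the paper may differ.
    return (n_actions ** n_agents) / n_agents


class EpsilonGreedySchedule:
    """Linearly annealed epsilon-greedy schedule whose final exploration
    rate is adjusted by the joint-action-to-agents ratio (illustrative only)."""

    def __init__(self, eps_start=1.0, eps_final=0.05, anneal_steps=50_000,
                 n_actions=5, n_agents=3):
        ratio = joint_action_ratio(n_actions, n_agents)
        # Example heuristic (not the paper's formula): keep a higher exploration
        # floor when the joint action space is large relative to the agent count.
        self.eps_start = eps_start
        self.eps_final = min(eps_start, eps_final * np.log1p(ratio))
        self.anneal_steps = anneal_steps

    def value(self, step: int) -> float:
        # Linear anneal from eps_start to the (scaled) eps_final.
        frac = min(step / self.anneal_steps, 1.0)
        return self.eps_start + frac * (self.eps_final - self.eps_start)

    def select_action(self, q_values: np.ndarray, step: int) -> int:
        # Standard epsilon-greedy selection over one agent's Q-values.
        if np.random.rand() < self.value(step):
            return np.random.randint(len(q_values))
        return int(np.argmax(q_values))
```

In a QMIX-style training loop, each agent would use such a schedule for its decentralized action selection; the replay-buffer change mentioned in the abstract would, roughly speaking, store and sample fixed-length recurrent rollouts instead of complete episodes.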
Download the paper on OpenReview (PDF, in English): https://openreview.net/pdf?id=RzoxFLA966
Anatoly Borzilov, Alexey Skrynnik, Aleksandr Panov. Rethinking Exploration and Experience Exploitation in Value-Based Multi-Agent Reinforcement Learning // The First International Conference on Computational Optimization, ICOMP 2024 (Innopolis, Russia, October 10–12, 2024).