Object-Oriented Decomposition of World Model in Reinforcement Learning

Authors

Panov A. I., Ugadyarov L. A.

Abstract

Object-oriented models are expected to generalize better and to operate on a more compact state representation. Recent studies have shown that using pre-trained object-centric representation learning models for state factorization in model-free algorithms improves the efficiency of policy learning. Approaches that use object-factored world models to predict environment dynamics have also proven effective in object-based grid-world environments. Following those works, we propose a novel object-oriented, model-based, value-based reinforcement learning algorithm, Object-Oriented Q-network (OOQN), which employs an object-oriented decomposition of the world and state-value models. Experimental results demonstrate that the developed algorithm outperforms state-of-the-art model-free policy gradient algorithms and a model-based value-based algorithm with a monolithic world model in tasks where the dynamics of individual objects are similar.
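To illustrate the idea of an object-factored world model (as opposed to a monolithic one), here is a minimal sketch in NumPy. All names, dimensions, and the linear-plus-tanh dynamics are illustrative assumptions for exposition, not the architecture from the paper: the key point is that one shared transition model is applied to every object slot, which is what helps generalization when the individual dynamics of objects are similar.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper):
K, D, A = 4, 8, 3                    # object slots, features per slot, action size
W = rng.normal(0, 0.1, (D + A, D))   # shared per-object transition weights

def object_factored_step(slots, action):
    """Predict next-step object slots with a single shared dynamics model.

    slots:  (K, D) array of object representations
    action: (A,) action vector (e.g. one-hot)
    Returns a (K, D) array of predicted next-step slots.
    """
    # Broadcast the action to every slot, then apply the same map to each:
    # parameter sharing across objects replaces one monolithic model over
    # the concatenated state.
    inp = np.concatenate([slots, np.tile(action, (K, 1))], axis=1)  # (K, D+A)
    return np.tanh(inp @ W)

slots = rng.normal(size=(K, D))
next_slots = object_factored_step(slots, np.eye(A)[0])
print(next_slots.shape)  # (4, 8)
```

A monolithic world model would instead map the full flattened state of size `K * D` to the next full state with one large network, so it cannot reuse what it learned about one object when predicting another.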

External links

Download the PDF from the IJCAI 2023 workshop website (in English): https://nsa-wksp.github.io/assets/papers/Object-Oriented%20Decomposition%20of%20World%20Model%20in%20Reinforcement%20Learning.pdf

How to cite

Leonid Ugadyarov, Aleksandr Panov. Object-Oriented Decomposition of World Model in Reinforcement Learning // NSA: Neuro-Symbolic Agents Workshop. Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI 2023). (Macao, August 19–25, 2023.)