Object-Oriented Decomposition of World Model in Reinforcement Learning

Authors

Leonid Ugadiarov, Aleksandr Panov

Annotation

Object-oriented models are expected to generalize better and to operate on a more compact state representation. Recent studies have shown that using pre-trained object-centric representation learning models for state factorization in model-free algorithms improves the efficiency of policy learning. Approaches that use object-factored world models to predict environment dynamics have also proven effective in object-based grid-world environments. Following these works, we propose a novel object-oriented model-based value-based reinforcement learning algorithm, Object-Oriented Q-Network (OOQN), which employs an object-oriented decomposition of the world and state-value models. Experimental results demonstrate that the developed algorithm outperforms state-of-the-art model-free policy gradient algorithms and a model-based value-based algorithm with a monolithic world model in tasks where the individual dynamics of the objects are similar.
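The paper's exact architectures are not reproduced here, but the core idea of an object-oriented decomposition can be illustrated with a minimal PyTorch sketch: a transition network shared across object slots predicts each object's next state, and a per-object Q-head produces values that are aggregated into a single Q-vector. All layer sizes, the one-hot action encoding, and the additive value aggregation are illustrative assumptions, not the authors' design.

import torch
import torch.nn as nn

class ObjectFactoredWorldModel(nn.Module):
    """Transition network shared across object slots (hypothetical sizes)."""
    def __init__(self, slot_dim: int, num_actions: int, hidden: int = 128):
        super().__init__()
        self.num_actions = num_actions
        self.transition = nn.Sequential(
            nn.Linear(slot_dim + num_actions, hidden),
            nn.ReLU(),
            nn.Linear(hidden, slot_dim),
        )

    def forward(self, slots: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        # slots: (batch, num_objects, slot_dim); action: (batch,) integer actions
        a = nn.functional.one_hot(action, self.num_actions).float()
        a = a.unsqueeze(1).expand(-1, slots.size(1), -1)  # broadcast action to every object
        delta = self.transition(torch.cat([slots, a], dim=-1))
        return slots + delta  # residual per-object prediction

class ObjectFactoredQNetwork(nn.Module):
    """Q-head shared across objects; per-object values are summed (an assumption)."""
    def __init__(self, slot_dim: int, num_actions: int, hidden: int = 128):
        super().__init__()
        self.q_head = nn.Sequential(
            nn.Linear(slot_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, slots: torch.Tensor) -> torch.Tensor:
        return self.q_head(slots).sum(dim=1)  # (batch, num_actions)

# Toy usage: 4 objects with 16-dim slots, 5 discrete actions.
model = ObjectFactoredWorldModel(slot_dim=16, num_actions=5)
qnet = ObjectFactoredQNetwork(slot_dim=16, num_actions=5)
slots = torch.randn(2, 4, 16)
action = torch.randint(0, 5, (2,))
next_slots = model(slots, action)
q_values = qnet(next_slots)
print(next_slots.shape, q_values.shape)  # torch.Size([2, 4, 16]) torch.Size([2, 5])

Sharing the transition and value networks across objects is what lets such a model exploit the setting highlighted in the abstract: when the individual dynamics of the objects are similar, every object provides training signal for the same shared parameters.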

External links

Download the PDF from the IJCAI 2023 workshop website: https://nsa-wksp.github.io/assets/papers/Object-Oriented%20Decomposition%20of%20World%20Model%20in%20Reinforcement%20Learning.pdf

Reference link

Leonid Ugadiarov, Aleksandr Panov. Object-Oriented Decomposition of World Model in Reinforcement Learning // NSA: Neuro-Symbolic Agents Workshop. Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI 2023). (Macao, S.A.R., 19–25 August 2023).