Advances in unsupervised object-centric representation learning have significantly broadened its applicability to downstream tasks. Recent work highlights that disentangled object representations can aid policy learning in image-based, object-centric reinforcement learning tasks. This paper proposes a novel object-centric reinforcement learning algorithm that integrates actor-critic and model-based approaches by incorporating an object-centric world model within the critic. The world model captures the environment's data-generating process by predicting the next state and reward given the current state-action pair, where actions are interventions in the environment. In model-based reinforcement learning, world model learning can be interpreted as a causal induction problem, where the agent must learn the causal relationships underlying the environment's dynamics. We evaluate our method in a simulated 3D robotic environment and a 2D environment with compositional structure. As baselines, we compare against object-centric, model-free actor-critic algorithms and a state-of-the-art monolithic model-based algorithm. While the baselines show comparable performance in easier tasks, our approach outperforms them in more challenging scenarios with a large number of objects or more complex dynamics.
DOI: 10.48550/arXiv.2310.17178
Download PDF from the ML Research Press publisher archive (English): https://raw.githubusercontent.com/mlresearch/v275/main/assets/ugadiarov25a/ugadiarov25a.pdf
Download PDF from arXiv.org (English): https://arxiv.org/abs/2310.17178
Leonid Ugadiarov, Vitaly Vorobyov, Aleksandr Panov. Relational Object-Centric Actor-Critic. Proceedings of Machine Learning Research, Vol. 275, 2025, pp. 1450–1476.
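To make the architecture described in the abstract concrete, below is a minimal, hedged sketch of a critic that wraps a learned object-centric world model: the world model maps per-object state vectors and an action to predicted next object states and a scalar reward, and the critic scores the state-action pair from those predictions. This is not the authors' implementation (the paper's model is relational and operates on learned object slots from images); all names and dimensions here (ObjectCentricWorldModel, ModelBasedCritic, num_objects, obj_dim, act_dim) are illustrative assumptions.

```python
# Illustrative sketch only: a critic built around an object-centric world model.
# Assumes factored per-object state vectors are already available.
import torch
import torch.nn as nn


class ObjectCentricWorldModel(nn.Module):
    """Predicts next per-object states and a scalar reward from (state, action)."""

    def __init__(self, num_objects: int, obj_dim: int, act_dim: int, hidden: int = 128):
        super().__init__()
        # Per-object transition applied with shared weights (object-factored dynamics).
        self.transition = nn.Sequential(
            nn.Linear(obj_dim + act_dim, hidden), nn.ReLU(), nn.Linear(hidden, obj_dim)
        )
        # Reward predicted from the full (flattened) object state and the action.
        self.reward_head = nn.Sequential(
            nn.Linear(num_objects * obj_dim + act_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, obj_states, action):
        # obj_states: (batch, num_objects, obj_dim); action: (batch, act_dim)
        b, k, _ = obj_states.shape
        act = action.unsqueeze(1).expand(b, k, -1)
        # Residual per-object transition: s'_i = s_i + f(s_i, a).
        next_obj_states = obj_states + self.transition(torch.cat([obj_states, act], dim=-1))
        reward = self.reward_head(torch.cat([obj_states.flatten(1), action], dim=-1))
        return next_obj_states, reward


class ModelBasedCritic(nn.Module):
    """Critic estimating Q(s, a) = r_hat(s, a) + gamma * V(s'_hat) via the world model."""

    def __init__(self, world_model: ObjectCentricWorldModel, num_objects: int,
                 obj_dim: int, gamma: float = 0.99, hidden: int = 128):
        super().__init__()
        self.world_model = world_model
        self.gamma = gamma
        self.value = nn.Sequential(
            nn.Linear(num_objects * obj_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, obj_states, action):
        next_obj_states, reward = self.world_model(obj_states, action)
        return reward + self.gamma * self.value(next_obj_states.flatten(1))


# Usage with made-up dimensions.
if __name__ == "__main__":
    wm = ObjectCentricWorldModel(num_objects=5, obj_dim=32, act_dim=4)
    critic = ModelBasedCritic(wm, num_objects=5, obj_dim=32)
    s = torch.randn(8, 5, 32)
    a = torch.randn(8, 4)
    print(critic(s, a).shape)  # torch.Size([8, 1])
```

The per-object MLP here stands in for the relational (interaction-aware) dynamics model used in the paper; in practice one would replace it with a graph neural network over object slots so that predicted transitions depend on object-object interactions.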