Multi-agent pathfinding (MAPF) is a challenging computational problem that typically requires finding collision-free paths for multiple agents in a shared environment. Solving MAPF optimally is NP-hard, yet efficient solutions are critical for numerous applications, including automated warehouses and transportation systems. Recently, learning-based approaches to MAPF have gained attention, particularly those leveraging deep reinforcement learning. Following current trends in machine learning, we have created a foundation model for MAPF problems, called MAPF-GPT. Using imitation learning on a set of pre-collected sub-optimal expert trajectories, we have trained a policy that can generate actions under partial observability without additional heuristics, reward functions, or communication with other agents. The resulting MAPF-GPT model demonstrates zero-shot learning abilities when solving MAPF problem instances that were not present in the training dataset. We show that MAPF-GPT notably outperforms the current best-performing learnable MAPF solvers on a diverse range of problem instances and is computationally efficient in inference mode.
DOI: 10.48550/arXiv.2409.00134
Download PDF from arXiv.org (in English): https://arxiv.org/abs/2409.00134
ResearchGate: https://www.researchgate.net/publication/383700406_MAPF-GPT_Imitation_Learning_for_Multi-Agent_Pathfinding_at_Scale
Anton Andreychuk, Konstantin Yakovlev, Aleksandr Panov, Alexey Skrynnik. MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale // arXiv:2409.00134v3, 25 September 2024.