Learning a Heuristic Function for the Planning Problem Using Transformer Models

Authors

Panov A. I., Yakovlev K. S., Kirilenko D. E.

Abstract

Heuristic search algorithms, e.g. A*, are commonly used tools for pathfinding on grids, i.e. graphs of regular structure that are widely employed to represent environments in robotics, video games, etc. Instance-independent heuristics for grid graphs, e.g. Manhattan distance, do not take obstacles into account, so a search guided by such heuristics performs poorly in obstacle-rich environments. To address this, we suggest learning instance-dependent heuristic proxies that are intended to notably increase the efficiency of the search. The first heuristic proxy we suggest learning is the correction factor, i.e. the ratio between the instance-independent cost-to-go estimate and the perfect one (computed offline at the training phase). Unlike learning the absolute values of the cost-to-go heuristic function, an approach known from prior work, learning the correction factor makes use of the knowledge of the instance-independent heuristic. The second heuristic proxy is the path probability, which indicates how likely a grid cell is to lie on the shortest path. This heuristic can be used in the Focal Search framework as the secondary heuristic, allowing us to preserve the guarantees on the bounded sub-optimality of the solution. We learn both suggested heuristics in a supervised fashion with state-of-the-art neural networks containing attention blocks (transformers). We conduct a thorough empirical evaluation on a comprehensive dataset of planning tasks, showing that the suggested techniques i) reduce the computational effort of A* by up to a factor of 4 while producing solutions whose costs exceed the costs of the optimal solutions by less than 0.3% on average; ii) outperform the competitors, which include conventional techniques from heuristic search, i.e. weighted A*, as well as state-of-the-art learnable planners.
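
The correction factor described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a 4-connected grid with unit move costs and Manhattan distance as the instance-independent heuristic, and it computes the correction factor exactly with a breadth-first search from the goal, where the paper would use a prediction from a trained network. All function and variable names are hypothetical, and `Fraction` is used only to keep the arithmetic exact in this toy example.

```python
import heapq
from collections import deque
from fractions import Fraction

MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))  # assumption: 4-connected, unit-cost moves


def neighbors(grid, cell):
    """Yield free cells adjacent to `cell` (0 = free, 1 = obstacle)."""
    rows, cols = len(grid), len(grid[0])
    for dr, dc in MOVES:
        r, c = cell[0] + dr, cell[1] + dc
        if 0 <= r < rows and 0 <= c < cols and grid[r][c] == 0:
            yield (r, c)


def manhattan(cell, goal):
    """Instance-independent cost-to-go estimate (ignores obstacles)."""
    return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])


def perfect_cost_to_go(grid, goal):
    """Exact cost-to-go for every cell reachable from the goal (BFS)."""
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        cur = queue.popleft()
        for nxt in neighbors(grid, cur):
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return dist


def correction_factor(grid, goal):
    """cf = Manhattan estimate / perfect cost-to-go; set to 1 at the goal."""
    h_star = perfect_cost_to_go(grid, goal)
    return {cell: Fraction(manhattan(cell, goal), d) if d else Fraction(1)
            for cell, d in h_star.items()}


def astar(grid, start, goal, heuristic):
    """Plain A*; ties on f are broken toward smaller h.

    Returns (path cost, number of expanded cells).
    """
    g = {start: 0}
    h0 = heuristic(start)
    open_heap = [(h0, h0, start)]
    closed = set()
    while open_heap:
        _, _, cur = heapq.heappop(open_heap)
        if cur in closed:
            continue
        closed.add(cur)
        if cur == goal:
            return g[cur], len(closed)
        for nxt in neighbors(grid, cur):
            new_g = g[cur] + 1
            if new_g < g.get(nxt, float("inf")):
                g[nxt] = new_g
                h = heuristic(nxt)
                heapq.heappush(open_heap, (new_g + h, h, nxt))
    return None, len(closed)


# Toy map: a wall in column 3 with a single gap in the bottom row.
GRID = [
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
START, GOAL = (0, 0), (0, 6)

cf = correction_factor(GRID, GOAL)  # assumes the goal is reachable from the start
cost_plain, expanded_plain = astar(GRID, START, GOAL, lambda n: manhattan(n, GOAL))
# Dividing the Manhattan estimate by the correction factor recovers the perfect heuristic.
cost_cf, expanded_cf = astar(GRID, START, GOAL, lambda n: manhattan(n, GOAL) / cf[n])
```

On this map both searches return a path of the same cost, while the corrected heuristic expands fewer cells because it does not explore the dead end in front of the wall. With a learned, approximate correction factor the corrected heuristic may overestimate the true cost-to-go wherever the prediction is too small, which is one reason the reported solutions are near-optimal on average and not guaranteed optimal.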

External links

DOI: 10.48550/arXiv.2212.11730

Download the PDF from the arXiv repository (in English): https://arxiv.org/abs/2212.11730

Download the source code on GitHub: https://github.com/AIRI-Institute/TransPath

Download the PDF from the official conference website (in English): https://ojs.aaai.org/index.php/AAAI/article/view/26465/26237

Read on the AIRI institute website (in English): https://airi-institute.github.io/TransPath/

Watch Daniil Kirilenko's presentation on the channel of the MIPT Center for Cognitive Modeling:

How to cite

Daniil Kirilenko, Anton Andreychuk, Aleksandr Panov, Konstantin Yakovlev. TransPath: Learning Heuristics for Grid-Based Pathfinding via Transformers // The 37th AAAI Conference on Artificial Intelligence (AAAI-23). Washington, D.C., USA. February 7-14, 2023.