Biologically plausible models of learning may provide crucial insight for building autonomous intelligent agents capable of performing a wide range of tasks. In this work, we propose a hierarchical model of an agent that operates in an unfamiliar environment and is driven by a reinforcement signal. We use temporal memory to learn sparse distributed representations of state-action pairs and a basal ganglia model to learn an effective action policy at different levels of abstraction. The learned model of the environment is used in two ways: to generate an intrinsic motivation signal, which drives the agent in the absence of an extrinsic signal, and to let the agent act in imagination, which we call dreaming. We demonstrate that the proposed architecture enables an agent to effectively reach goals in grid environments.
DOI: 10.1186/s40708-022-00156-6
Peter Kuderov's presentation from the Center for Cognitive Modeling MIPT channel:
Download PDF or read online at Brain Informatics: https://braininformatics.springeropen.com/articles/10.1186/s40708-022-00156-6
Download PDF or read online at the United States National Library of Medicine: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8976870/
Download PDF or read online at Europe PubMed Central: http://europepmc.org/article/MED/35366128
Dzhivelikian, E., Latyshev, A., Kuderov, P. et al. Hierarchical intrinsically motivated agent planning behavior with dreaming in grid environments. Brain Inf. 9, 8 (2022). https://doi.org/10.1186/s40708-022-00156-6