In this work, we explore the applicability of autoencoders as vector compressors in the approximate nearest neighbor search pipeline. We conduct extensive tests with several autoencoders and indices on several large-scale datasets. The results show that although no combination of autoencoder and index completely outperforms the pure solutions, such combinations can be useful in some cases. We also find empirical connections between the optimal hidden layer dimension and the intrinsic dimensionality of the datasets.
Download the PDF from the conference website (in English): https://damdid2023.hse.ru/mirror/pubs/share/867942008.pdf
Igor Buyanov, Vasiliy Yadrintsev, and Ilya Sochenkov. Using Autoencoders to Improve Nearest Neighbor Search on Large Datasets // Data Analytics and Management in Data Intensive Domains, DAMDID/RCDL 2023 (HSE University, Moscow, October 24-27, 2023).