The paper tests the hypothesis that neural autoencoders can serve as a vector compression method in the approximate nearest neighbor search pipeline. The evaluation was conducted on several large datasets using various autoencoder architectures and indexes. Although no combination of autoencoder and index fully outperforms pure solutions, such combinations can be useful in some cases. We also identified empirical relationships between the optimal dimensionality of the hidden layer and the intrinsic dimensionality of the datasets, and showed that the loss function is a determining factor for compression quality.
DOI: 10.15514/ISPRAS-2024-36(1)-1
Download the article (PDF) or read online at the official website (in Russian): https://ispranproceedings.elpub.ru/jour/article/view/1685
Download the journal (PDF) or read online at the official website (in Russian): https://ispranproceedings.elpub.ru/jour/issue/viewIssue/116/154
Download the article (PDF) from the Institute for System Programming website (in Russian): https://www.ispras.ru/en/proceedings/isp_36_2024_1/isp_36_2024_1_7/
Download the article (PDF) from eLibrary (in Russian, registration required): https://elibrary.ru/item.asp?id=67209307
Buyanov I. O., Yadrinsev V. V., Sochenkov I. V. Neural vector compression in Approximate Nearest Neighbor Search on Large Datasets // Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS). 2024;36(1):7-22.