This article presents a new approach to large-scale patent classification. The need to classify documents often arises in professional information retrieval systems. In this paper we describe our approach, which is based on linguistically supported k-nearest neighbors (KNN). We evaluate it experimentally on Russian and English datasets and compare it with fastText, a modern classification technique. We show that KNN is a viable alternative to traditional text classifiers, achieving comparable accuracy while using fewer additional hardware resources.
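To make the general idea concrete, here is a minimal sketch of k-nearest-neighbor text classification over TF-IDF vectors with cosine similarity. It is not the authors' implementation: plain whitespace tokenization stands in for the linguistic analysis the paper relies on, and the function names, the toy documents, and the class labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace split, a stand-in for real linguistic analysis.
    return text.lower().split()


def vectorize(tokens, idf):
    # Build an L2-normalized TF-IDF vector, ignoring terms unseen in training.
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: count * idf[t] for t, count in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def fit(docs):
    # Compute smoothed IDF weights and the training vectors.
    tokenized = [tokenize(d) for d in docs]
    df = Counter(term for toks in tokenized for term in set(toks))
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [vectorize(toks, idf) for toks in tokenized], idf


def cosine(a, b):
    # Both vectors are unit length, so the dot product is the cosine similarity.
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def knn_classify(query, train_vecs, labels, idf, k=3):
    # Rank training documents by similarity and let the top k vote,
    # weighting each vote by its similarity score.
    q = vectorize(tokenize(query), idf)
    scored = sorted(
        ((cosine(q, v), label) for v, label in zip(train_vecs, labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    votes = defaultdict(float)
    for sim, label in scored:
        votes[label] += sim
    return max(votes, key=votes.get)


# Toy training data (hypothetical documents and labels, for illustration only).
TRAIN_DOCS = [
    "antenna for wireless signal transmission",
    "radio receiver with signal amplifier",
    "surgical instrument for bone implant",
    "implant coating for medical device",
]
TRAIN_LABELS = ["H04", "H04", "A61", "A61"]
TRAIN_VECS, IDF = fit(TRAIN_DOCS)

if __name__ == "__main__":
    print(knn_classify("wireless antenna signal", TRAIN_VECS, TRAIN_LABELS, IDF))
```

A search engine already maintains an index that can return the most similar documents for a query, so the neighbor lookup in such a scheme can reuse existing infrastructure. The brute-force scan above is only for illustration.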
DOI: http://dx.doi.org/10.1088/1742-6596/1117/1/012004
Read at ResearchGate: https://www.researchgate.net/publication/329216402_Fast_and_Accurate_Patent_Classification_in_Search_Engines
RUDN Repository: https://repository.rudn.ru/en/records/article/record/36232/
Semantic Scholar: https://api.semanticscholar.org/CorpusID:70121299
Yadrintsev, V., Bakarov, A., Suvorov, R., & Sochenkov, I. Fast and Accurate Patent Classification in Search Engines // Journal of Physics: Conference Series. – IOP Publishing, 2018. – Vol. 1117. – No. 1. – P. 012004.