This article addresses the stacking of two approaches to patent classification. The first is a linguistically supported k-nearest neighbors algorithm that searches for topically similar documents by comparing vectors of lexical descriptors. The second is fastText, a word-embedding-based classifier in which the sentence (or document) vector is obtained by averaging n-gram embeddings, and a multinomial logistic regression then uses these vectors as features. We show on Russian and English datasets that the stacking classifier outperforms each of the single classifiers.
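The combination described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions. The toy corpus and the class labels are hypothetical. The k-nearest neighbors part compares plain term-count vectors by cosine similarity instead of linguistically derived lexical descriptors. The fastText-like part averages fixed random embeddings of hashed character 3-grams and trains only the softmax layer, whereas fastText also learns the embeddings. The stacking step is reduced to a weighted average of the two probability vectors, a simplified stand-in for a trained meta-classifier. All function names and parameters (`knn_proba`, `doc_vector`, `alpha`, and so on) are illustrative.

```python
import math
import random
import zlib
from collections import Counter

CLASSES = ["A61", "H04"]  # hypothetical class labels
TRAIN = [                 # hypothetical toy corpus: (text, label)
    ("tablet composition for treating infection", "A61"),
    ("pharmaceutical composition with antibiotic", "A61"),
    ("wireless network signal transmission", "H04"),
    ("antenna for radio signal transmission", "H04"),
]

def cosine(a, b):
    """Cosine similarity of two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_proba(doc, k=3):
    """Similarity-weighted votes of the k nearest training documents."""
    q = Counter(doc.lower().split())
    scored = [(cosine(q, Counter(t.lower().split())), c) for t, c in TRAIN]
    votes = {c: 0.0 for c in CLASSES}
    for sim, c in sorted(scored, reverse=True)[:k]:
        votes[c] += sim
    total = sum(votes.values())
    if total == 0:  # no lexical overlap: fall back to a uniform distribution
        return [1.0 / len(CLASSES)] * len(CLASSES)
    return [votes[c] / total for c in CLASSES]

DIM, BUCKETS = 8, 64
_rng = random.Random(0)
# Assumption: fixed random n-gram embeddings (fastText would learn these).
EMB = [[_rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(BUCKETS)]

def doc_vector(doc, n=3):
    """Average the embeddings of hashed character n-grams."""
    text = doc.lower()
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    vec = [0.0] * DIM
    for g in grams:
        row = EMB[zlib.crc32(g.encode("utf-8")) % BUCKETS]
        vec = [v + r for v, r in zip(vec, row)]
    return [v / max(len(grams), 1) for v in vec]

def softmax(z):
    m = max(z)
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]

def linear_proba(W, x):
    """Multinomial logistic regression: softmax over per-class linear scores."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in W])

def train_softmax(epochs=200, lr=0.5):
    """Fit the softmax layer with plain stochastic gradient descent."""
    W = [[0.0] * DIM for _ in CLASSES]
    for _ in range(epochs):
        for doc, label in TRAIN:
            x = doc_vector(doc)
            p = linear_proba(W, x)
            for k, c in enumerate(CLASSES):
                grad = p[k] - (1.0 if c == label else 0.0)
                W[k] = [w - lr * grad * xi for w, xi in zip(W[k], x)]
    return W

def stacked_proba(doc, W, alpha=0.5):
    """Weighted average of both base classifiers' probability vectors."""
    p_knn = knn_proba(doc)
    p_emb = linear_proba(W, doc_vector(doc))
    return [alpha * a + (1 - alpha) * b for a, b in zip(p_knn, p_emb)]

def predict(doc, W, alpha=0.5):
    p = stacked_proba(doc, W, alpha)
    return CLASSES[p.index(max(p))]

W = train_softmax()
print(predict("antibiotic tablet composition", W))
```

In a fuller setup the weight `alpha` would be replaced by a meta-classifier trained on held-out predictions of the base classifiers, so that the meta-level does not see outputs the base models produced for their own training documents.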
DOI: http://dx.doi.org/10.1134/S1995080219110325
PDF on SpringerLink (in English): https://link.springer.com/content/pdf/10.1134/S1995080219110325.pdf
RUDN University repository: https://repository.rudn.ru/ru/records/article/record/54931/
Yadrintsev V. V., Sochenkov I. V. The Hybrid Method for Accurate Patent Classification // Lobachevskii Journal of Mathematics, 2019, Volume 40, Issue 11, pp. 1873–1880.