The Hybrid Method for Accurate Patent Classification

Authors

Yadrintsev V., Sochenkov I.

Abstract

This article presents a stacking of two approaches to patent classification. The first is a linguistically supported k-nearest neighbors algorithm that searches for topically similar documents by comparing vectors of lexical descriptors. The second is fastText, a word-embedding-based method in which the sentence (or document) vector is obtained by averaging n-gram embeddings, and a multinomial logistic regression then uses these vectors as features. On Russian and English datasets, we show that the stacking classifier outperforms either single classifier.
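The stacking idea described above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: plain token counts stand in for the lexical descriptors, fixed hashed character n-gram vectors stand in for learned fastText embeddings, the meta-classifier is a small multinomial logistic regression fitted on a held-out split, and all function names and the toy documents are hypothetical.

```python
import math
import zlib
from collections import Counter

DIM = 8  # size of the hashed n-gram vectors (arbitrary toy value)

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_proba(text, train, n_classes, k=3):
    """Base model 1: vote shares of the k most similar training documents."""
    q = Counter(text.split())
    ranked = sorted(train, key=lambda d: -cosine(q, Counter(d[0].split())))
    votes = [0.0] * n_classes
    for _, label in ranked[:k]:
        votes[label] += 1.0 / k
    return votes

def ngram_vector(text, n=3):
    """Average of fixed pseudo-random vectors, one per character n-gram."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)] or [text]
    vec = [0.0] * DIM
    for g in grams:
        for j in range(DIM):
            h = zlib.crc32(f"{j}:{g}".encode("utf-8"))
            vec[j] += (h / 0xFFFFFFFF - 0.5) / len(grams)
    return vec

def predict_softmax(W, x):
    """Class probabilities of a multinomial logistic regression."""
    xb = x + [1.0]  # append the bias input
    z = [sum(w * v for w, v in zip(row, xb)) for row in W]
    m = max(z)
    e = [math.exp(v - m) for v in z]
    return [v / sum(e) for v in e]

def train_softmax(X, y, n_classes, epochs=300, lr=0.5):
    """Fit multinomial logistic regression by stochastic gradient descent."""
    W = [[0.0] * (len(X[0]) + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        for x, label in zip(X, y):
            p = predict_softmax(W, x)
            for c in range(n_classes):
                g = p[c] - (1.0 if c == label else 0.0)
                for j, v in enumerate(x + [1.0]):
                    W[c][j] -= lr * g * v
    return W

def fit_stack(base_train, meta_train, n_classes):
    """Fit base model 2 on base_train, then the meta-classifier on meta_train."""
    W_base = train_softmax([ngram_vector(t) for t, _ in base_train],
                           [label for _, label in base_train], n_classes)

    def meta_features(text):
        # Concatenate the class scores of both base models.
        return (knn_proba(text, base_train, n_classes)
                + predict_softmax(W_base, ngram_vector(text)))

    W_meta = train_softmax([meta_features(t) for t, _ in meta_train],
                           [label for _, label in meta_train], n_classes)
    return lambda text: predict_softmax(W_meta, meta_features(text))

# Hypothetical toy corpus: class 0 = engines, class 1 = pharmaceuticals.
base_train = [
    ("combustion engine piston cylinder", 0),
    ("engine valve piston fuel", 0),
    ("fuel injection engine cylinder", 0),
    ("tablet drug compound dosage", 1),
    ("drug dosage pharmaceutical compound", 1),
    ("pharmaceutical tablet coating drug", 1),
]
meta_train = [
    ("piston engine fuel pump", 0),
    ("cylinder valve engine", 0),
    ("drug compound tablet", 1),
    ("dosage pharmaceutical coating", 1),
]

predict = fit_stack(base_train, meta_train, n_classes=2)
print(predict("engine piston valve"))
print(predict("drug tablet dosage"))
```

Fitting the meta-classifier on documents that the base models have not seen keeps it from simply trusting base scores that were overfitted to their own training data. With a realistic corpus, out-of-fold predictions from cross-validation would be the more usual way to achieve the same separation.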

External links

DOI: http://dx.doi.org/10.1134/S1995080219110325

PDF at SpringerLink: https://link.springer.com/content/pdf/10.1134/S1995080219110325.pdf

ResearchGate: https://www.researchgate.net/publication/337566580_The_Hybrid_Method_for_Accurate_Patent_Classification

RUDN University. Repository: https://repository.rudn.ru/en/records/article/record/54931/

Semantic Scholar: https://api.semanticscholar.org/CorpusID:213101916

Reference link

Yadrintsev V. V., Sochenkov I. V. The Hybrid Method for Accurate Patent Classification // Lobachevskii Journal of Mathematics, 2019, Volume 40, Issue 11, pp. 1873–1880.