Studying global experience in legislative change and rule-making increasingly requires tools for retrieving regulatory documents written in different languages. One aspect of this is the retrieval of documents that are thematically similar to a given input document. In this context, the important task of cross-lingual search arises: the user of an information system specifies a reference document in one language, and the search results contain relevant documents in other languages. The article describes different approaches to solving this problem, from classic mediator-based methods to more modern solutions based on distributional semantics. The test collection used in the study was taken from the United Nations Digital Library, which provides legal documents in the original English together with their Russian translations.
DOI: 10.3103/S0147688223050167
Download PDF from the Scientific and Technical Information Processing journal website: https://link.springer.com/content/pdf/10.3103/S0147688223050167.pdf
Zhebel, V. V., Devyatkin, D. A., Zubarev, D. V., Sochenkov, I. V. Approaches to Cross-Language Retrieval of Similar Legal Documents Based on Machine Learning // Scientific and Technical Information Processing, 2023. Vol. 50. Iss. 5. P. 494–499.