This paper examines methods for discourse parsing of Russian within the framework of Rhetorical Structure Theory (RST). It describes the development of a new corpus for full-text parsing of Russian texts in various genres, and analyzes how well various pre-trained encoder language models support rhetorical analysis on two Russian-language corpora. We propose a method for training neural rhetorical parsers on a mix of expert-annotated data. This approach allows the models to parse texts effectively despite differences in the sets of rhetorical relations used in different corpora. The method is evaluated on the two large multi-genre corpora with rhetorical annotation for Russian.
At the Steklov Mathematical Institute RAS: https://www.mathnet.ru/eng/iipr609
Chistova, Elena. Methods for Rhetorical Structure Parsing in Russian // Artificial Intelligence and Decision Making, 2024, Issue 4, pp. 79–92.