This paper investigates the cues that signal relations between discourse units in Russian. Building a lexicon of discourse connectives is an indispensable subtask in many discourse parsing applications and an essential issue in theoretical research on text coherence. To develop such a resource for Russian, we conducted a corpus-based study of discourse connectives manually extracted from the Russian Rhetorical Structure Treebank (Ru-RSTreebank). The Treebank includes 79 texts annotated within the RST framework (Mann, Thompson 1988). To provide a deeper analysis of connectives in Russian, we focus on causal relations only, namely the ‘Cause-Effect’ relation. Some connectives (primary connectives) are enumerated in grammars and dictionaries; they primarily mark intra-sentential relations. However, there is an extensive class of less grammaticalized items (secondary connectives) that have received less attention so far. Some of them are based on content words (e.g. по причине ‘for the cause’). Secondary connectives often serve as linking devices for inter-sentential relations. We propose an annotation scheme for connectives in Russian and specify the basic patterns that can be used to mine less grammaticalized connectives in an unannotated corpus. We also compare the two classes of connectives (primary vs. secondary). Our research shows that the two classes differ in their properties: there are statistically significant differences between them with respect to nucleus/satellite position, intra- vs. inter-sentential relations, and some other features.
PDF at the Dialogue international conference: http://www.dialog-21.ru/media/4338/toldovas.pdf
eLIBRARY: https://www.elibrary.ru/item.asp?id=35737656
Publications of the Higher School of Economics: https://publications.hse.ru/en/chapters/222860597
Toldova S., Pisarevskaya D., Kobozeva M., Vasilyeva M. The cues for rhetorical relations in Russian: “cause–effect” relation in Russian rhetorical structure treebank // Computational Linguistics and Intellectual Technologies: Proceedings of the International Conference “Dialogue 2018”. – Moscow, May 30–June 2, 2018.