Development of a Russian text corpus with annotation based on Rhetorical Structure Theory


Kobozeva M., Suvorova (Ananieva) M.


This paper presents an adaptation of Rhetorical Structure Theory (RST) to the Russian language and the development of an RST corpus intended for training an automatic discourse parser. The authors' survey shows that discourse analysis improves the performance of systems for machine translation, automatic summarization, author identification, and similar tasks. At the time of writing, ten texts from the SynTagRus treebank had been annotated. The list of discourse relations proposed by W. Mann and S. Thompson has been modified, and a list of Russian discourse markers has been compiled. The authors also present preliminary discourse-structure statistics based on this annotation.

Kobozeva M. V., Ananyeva M. I. Development of a Russian text corpus with annotation based on Rhetorical Structure Theory // Proceedings of the International Conference "Dialogue". Student session. - 2016