Online, or cyber, extremism is a critical security problem for Russia and other countries, as the social web is widely used for radical activity and propaganda. This paper considers the problem of extremist text detection in Russian social media. We propose models and methods for identifying extremist texts in Russian that apply deep linguistic parsing and statistical text processing. We also present a dataset of terrorist, religious hate, racist, and other radical texts in Russian, together with the results of experiments on this dataset. The experiments show that low-dimensional psycholinguistic and semantic features allow extremist texts to be detected with fairly good performance, while lexical features allow the topics of the detected extremist texts to be recognized.
DOI: http://dx.doi.org/10.33965/wbc2019_201908L041
PDF at the IADIS digital library: http://www.iadisportal.org/digital-library/mdownload/extremist-text-detection-in-social-web
Research Gate: https://www.researchgate.net/publication/335512137_EXTREMIST_TEXT_DETECTION_IN_SOCIAL_WEB
Higher School of Economics publications: https://publications.hse.ru/en/chapters/314143478
Semantic Scholar: https://api.semanticscholar.org/CorpusID:203060289
Devyatkin D., Smirnov I., Solovyev F., Suvorova M., Chepovskiy A. Extremist text detection in social web // Proceedings of the Multi Conference on Computer Science and Information Systems, MCCSIS 2019. Porto, 2019. Pages 344-350.