Feature engineering for depression detection in social media

Authors

Smirnov I. V., Devyatkin D. A., Stankevich M. A.

Abstract

This research is based on the CLEF/eRisk 2017 pilot task, which focuses on early risk detection of depression. The CLEF/eRisk 2017 dataset consists of text messages collected from 887 Reddit users. The goal of the task is to classify users into two groups: those at risk of depression and those not at risk. This paper considers different feature sets for detecting depression among Reddit users based on their text messages. We evaluate bag-of-words, embedding, and bigram models on the CLEF/eRisk 2017 dataset and assess the applicability of stylometric and morphological features. We also compare our results with those in the CLEF/eRisk 2017 task report.

External links

DOI: http://dx.doi.org/10.5220/0006598604260431

PDF on Semantic Scholar (in English): https://api.semanticscholar.org/CorpusID:4776046

ResearchGate: https://www.researchgate.net/publication/322874168_Feature_Engineering_for_Depression_Detection_in_Social_Media

Citation

Stankevich, M., Isakov, V., Devyatkin, D., Smirnov, I. Feature engineering for depression detection in social media (2018) ICPRAM 2018 - Proceedings of the 7th International Conference on Pattern Recognition Applications and Methods, 2018-January, pp. 426-431.