Goal-oriented conversational agents are systems able to converse with humans in natural language to help them reach a certain goal. The number of goals (or domains) that an agent can converse about is limited, and one of the issues is identifying whether a user is talking about an unknown domain (so that the agent can report a misunderstanding or switch to chit-chat mode). We argue that this issue can be resolved by treating it as an anomaly detection task, a task from the field of machine learning. The scientific community has developed a broad range of methods for this task, but their applicability to short text data has not been investigated before. The aim of this work is to compare the performance of 6 different anomaly detection methods on Russian and English short texts modeling conversational utterances, proposing the first evaluation framework for this task. As a result of the study, we find that a simple threshold on cosine similarity works better than the other methods for both of the considered languages.
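To illustrate the idea of a cosine-similarity threshold, here is a minimal sketch. It is not the paper's implementation: the abstract does not say how utterances are vectorized or what they are compared against, so the bag-of-words `embed` function, the domain centroid, the example utterances, and the `threshold=0.2` value are all assumptions made for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for any text embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(texts):
    """Sum of the utterance vectors; cosine is scale-invariant, so no averaging is needed."""
    total = Counter()
    for t in texts:
        total.update(embed(t))
    return total


def is_out_of_domain(utterance, domain_centroid, threshold=0.2):
    """Flag an utterance as anomalous when its similarity to the known domain
    falls below the threshold (hypothetical value, to be tuned on held-out data)."""
    return cosine(embed(utterance), domain_centroid) < threshold


# Hypothetical in-domain utterances for a table-booking agent.
in_domain = [
    "book a table for two",
    "reserve a table tonight",
    "cancel my table booking",
]
c = centroid(in_domain)

print(is_out_of_domain("book a table for tonight", c))
print(is_out_of_domain("what is your favourite movie", c))
```

In a real agent, an utterance flagged this way would trigger the misunderstanding report or the switch to chit-chat mode described above. The threshold would normally be chosen on validation data rather than fixed by hand.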
DOI: https://doi.org/10.1007/978-3-030-02846-6_23
RSCI (Russian Science Citation Index): https://elibrary.ru/item.asp?id=38618486
Cambridge University Press: https://www.cambridge.org/core/journals/natural-language-engineering/article/abs/unsupervised-modeling-anomaly-detection-in-discussion-forums-posts-using-global-vectors-for-text-representation/D48695A566706691800569E2D724F918
RUDN University Repository: https://repository.rudn.ru/ru/records/article/record/36568/
Bakarov A., Yadrintsev V., Sochenkov I. Anomaly Detection for Short Texts: Identifying Whether Your Chatbot Should Switch from Goal-Oriented Conversation to Chit-Chatting // International Conference on Digital Transformation and Global Society. – Springer, Cham, 2018. – P. 289-298.