The paper presents a method for detecting pornography in web pages based on natural language processing. The classification method uses a feature set of single words and groups of words. Syntactic analysis is performed to extract collocations, and a modification of TF-IDF is used to weight terms. The quality and performance of the classification are evaluated and compared.
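As a rough illustration of weighting single words and word groups, the sketch below computes plain, unmodified TF-IDF. It is not the paper's method: the abstract does not say how TF-IDF is modified, and adjacent word pairs are used here as a simple stand-in for collocations extracted by syntactic analysis. The function names `extract_terms` and `tf_idf` are hypothetical.

```python
import math
from collections import Counter


def extract_terms(text):
    """Return single words plus adjacent word pairs.

    Assumption: adjacent pairs stand in for collocations; the paper
    extracts collocations with syntactic analysis instead.
    """
    words = text.lower().split()
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs


def tf_idf(documents):
    """Compute standard TF-IDF weights for each document in a list of strings.

    Assumption: this is the textbook formula, not the modified variant
    mentioned in the abstract.
    """
    term_lists = [extract_terms(doc) for doc in documents]

    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter()
    for terms in term_lists:
        doc_freq.update(set(terms))

    n_docs = len(documents)
    weights = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

With this formula, a term that occurs in every document gets a weight of zero, while words and word pairs specific to one document get positive weights that a classifier could use as features.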
DOI: http://dx.doi.org/10.1007/978-3-319-01931-4_31
RSCI (Russian Science Citation Index): https://elibrary.ru/item.asp?id=20455489
Read on ResearchGate (in English): https://www.researchgate.net/publication/290622382_Method_for_Pornography_Filtering_in_the_WEB_Based_on_Automatic_Classification_and_Natural_Language_Processing
Roman Suvorov, Ilya Sochenkov, Ilya Tikhomirov. Method for Pornography Filtering in the WEB Based on Automatic Classification and Natural Language Processing // In: Proceedings of the 15th International Conference, SPECOM 2013. Ed. Miloš Železný, Ivan Habernal, Andrey Ronzhin. Pilsen, Czech Republic, 2013, pp. 233-240. ISBN 978-3-319-01930-7