This paper describes the method implemented in the software submitted to the PAN 2014 competition for the source retrieval task. To generate queries, we use the most important noun phrases and words of sentences selected from a given suspicious document. To decide which documents to download as likely sources of plagiarism, we employ a sentence similarity measure.
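As a rough illustration of the download decision, the sketch below compares sentences from a suspicious document with sentences from a candidate document and flags the candidate when any pair is similar enough. The abstract does not specify the similarity measure, so Jaccard overlap of word sets is used here purely as a stand-in; the function names and the `threshold` value are hypothetical and not taken from the paper.

```python
import re


def tokens(sentence):
    """Return the set of lowercase alphanumeric word tokens in a sentence."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def sentence_similarity(a, b):
    """Jaccard overlap of word sets (a stand-in, not the paper's measure)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def should_download(suspicious_sentences, candidate_sentences, threshold=0.5):
    """True if any candidate sentence is similar enough to any suspicious one.

    The threshold of 0.5 is an arbitrary illustrative value.
    """
    return any(
        sentence_similarity(s, c) >= threshold
        for s in suspicious_sentences
        for c in candidate_sentences
    )
```

A real system would likely weight terms by importance and normalize word forms rather than compare raw word sets, but the control flow (score sentence pairs, download on a sufficiently high score) would look similar.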
ResearchGate: https://www.researchgate.net/publication/330401166_PARAPHRASED_PLAGIARISM_DETECTION_USING_SENTENCE_SIMILARITY
Semantic Scholar: https://api.semanticscholar.org/CorpusID:15149652
Zubarev, D., Sochenkov, I. Using Sentence Similarity Measure for Plagiarism Source Retrieval: Notebook for PAN at CLEF 2014. In: CEUR Workshop Proceedings, CEUR-WS.org, eds. L. Cappellato, N. Ferro, M. Halvey and W. Kraaij. 2014. pp. 1027–1034 (accessed 22.09.2014).