82484 | ИПУ РАН

Publication details

Publication type:

Book chapter

Title:

Cross-Language Plagiarism Detection: A Case Study of European Languages Academic Works

Edition:

1st edition

ISBN/ISSN:

978-3-031-16976-2

DOI:

10.1007/978-3-031-16976-2_9

Source title:

Academic Integrity: Broadening Practices, Technologies, and the Role of Students. Proceedings from the European Conference on Academic Integrity and Plagiarism 2021

City:

Cham

Publisher:

Springer

Year of publication:

2022

Pages:

143-161

Abstract

The chapter investigates the problem of cross-lingual plagiarism in academic works at European universities. Despite the potentially massive scale of improper text reuse, most text reuse detection systems focus only on monolingual text reuse, where the analysed document and the source of the reused text are written in the same language. In this chapter, we analyse a more difficult setting, in which the language of the analysed document differs from the language of the reused source. To address this problem, we present a system for cross-lingual text reuse detection. The system combines statistical machine translation with deep learning methods based on contextualized word embeddings, such as BERT and its multilingual version, LaBSE. To analyse the efficiency of the proposed method, we conduct experiments both on a synthetic dataset generated using machine translation systems and on a real dataset of academic graduation theses. We experimented on a collection of 10202 documents and found 103 documents with a significant amount of cross-lingual text reuse. Although these results are preliminary and should be verified further, they confirm that this problem is widespread in academia.

Bibliographic reference:

Бахтеев О.Ю., Чехович Ю.В., Грабовой А.В., Горбачев Г.В., Горленко Т.А., Гращенков К.В., Ивахненко А.А., Кильдяков А.С., Хазов А.В., Комарницкий В.Е., Никитов А.В., Огальцов А.В., Сахарова А.В. Cross-Language Plagiarism Detection: A Case Study of European Languages Academic Works / Academic Integrity: Broadening Practices, Technologies, and the Role of Students. Proceedings from the European Conference on Academic Integrity and Plagiarism 2021. Cham: Springer, 2022. pp. 143-161.