82522 | ИПУ РАН

Author(s):

Пойманов Д.Р., Местецкий Л.М., Грабовой А.В.

Number of authors:

3

Publication details

Publication type:

Conference paper

Title:

N-Gram Perplexity-Based AI-Generated Text Detection

Electronic publication:

Yes

ISBN/ISSN:

2767-9535

DOI:

10.1109/ispras64596.2024.10899150

Conference name:

2024 Ivannikov Ispras Open Conference (ISPRAS)

Source title:

Proceedings of the Ivannikov Memorial Workshop (IVMEM), 2024

City:

Moscow

Publisher:

IEEE

Year of publication:

2024

Pages:

https://ieeexplore.ieee.org/abstract/document/10899150

Abstract

Currently, more effort goes into improving the capabilities of Large Language Models (LLMs) than into addressing their implications. Modern language models can generate texts that appear indistinguishable from those written by human experts. While improving quality of life, such breakthroughs also pose new challenges in education, science and social media. In addition, existing approaches to detecting AI-generated texts require either high computational cost or access to the internal computations of LLMs, which hinders their public availability. Based on these considerations, this paper presents a new paradigm for detecting AI-generated texts based on collecting preliminary token statistics and computing n-gram perplexity features. On the combination of the HC3, M4GT and MAGE datasets, it shows a 2x speedup over existing approaches with a quality drop of around 5%. Moreover, the combination of methods achieves the best quality. This strikes a balance between computational cost, accessibility and performance.
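The abstract does not specify the tokenizer, the smoothing scheme, or the classifier, so the following is only a minimal sketch of the general idea of n-gram perplexity features, not the authors' implementation. It assumes whitespace tokenization, add-one (Laplace) smoothing, and a small reference corpus from which token statistics are collected in advance; the names `NGramModel` and `perplexity_features` are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class NGramModel:
    """Add-one smoothed n-gram model built from precollected token counts.

    Assumption: this is an illustrative stand-in for the "preliminary
    token statistics" mentioned in the abstract, not the paper's model.
    """

    def __init__(self, corpus, n=2):
        self.n = n
        self.ngram_counts = Counter()
        self.context_counts = Counter()
        self.vocab = set()
        for tokens in corpus:
            # Pad the start so the first token also has a context.
            padded = ["<s>"] * (n - 1) + tokens
            self.vocab.update(tokens)
            for gram in ngrams(padded, n):
                self.ngram_counts[gram] += 1
                self.context_counts[gram[:-1]] += 1

    def log_prob(self, gram):
        # Laplace smoothing; the extra +1 in the vocabulary size
        # reserves probability mass for tokens never seen in the corpus.
        v = len(self.vocab) + 1
        seen = self.ngram_counts[gram] + 1
        total = self.context_counts[gram[:-1]] + v
        return math.log(seen / total)

    def perplexity(self, tokens):
        """exp of the negative mean log-probability over the n-grams."""
        padded = ["<s>"] * (self.n - 1) + tokens
        grams = ngrams(padded, self.n)
        if not grams:
            raise ValueError("text is too short for this n-gram order")
        avg = sum(self.log_prob(g) for g in grams) / len(grams)
        return math.exp(-avg)


def perplexity_features(text, models):
    """One perplexity value per model, usable as a feature vector."""
    tokens = text.lower().split()  # assumption: whitespace tokenization
    return [m.perplexity(tokens) for m in models]


if __name__ == "__main__":
    corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
    models = [NGramModel(corpus, n=1), NGramModel(corpus, n=2)]
    print(perplexity_features("the cat sat", models))
```

In a detector of this kind, such feature vectors would typically be passed to a lightweight classifier trained on labeled human-written and machine-generated texts. Because the counts are collected once in advance, scoring a new text needs only table lookups rather than a forward pass through an LLM.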

Bibliographic reference:

Пойманов Д.Р., Местецкий Л.М., Грабовой А.В. N-Gram Perplexity-Based AI-Generated Text Detection / Proceedings of the Ivannikov Memorial Workshop (IVMEM), 2024. Moscow: IEEE, 2024. https://ieeexplore.ieee.org/abstract/document/10899150