82391

Author(s): 

Number of authors: 

5

Publication details

Publication type: 

Article in a journal/collection

Title: 

RuGECToR: Rule-Based Neural Network Model for Russian Language Grammatical Error Correction

ISBN/ISSN: 

0361-7688

DOI: 

10.1134/S0361768824700129

Source title: 

  • Programming and computer software

Volume designation and number: 

Vol. 50, No. 4

City: 

  • New York

Publisher: 

  • Pleiades Publishing Ltd

Year of publication: 

2024

Pages: 

315-321
Abstract

Grammatical error correction is one of the core natural language processing tasks. At present, the open-source state-of-the-art sequence tagging model for English is GECToR. For Russian, there are no equally effective solutions because annotated datasets are lacking, which motivated this research. In this paper, we describe the process of creating a synthetic dataset and training a model on it. We adapt the GECToR architecture to the Russian language and call the result RuGECToR. This architecture was chosen because, unlike the sequence-to-sequence approach, it is easy to interpret and does not require a large amount of training data. The aim is to train the model so that it generalizes the morphological properties of the language rather than adapting to a specific training sample. The presented model achieves a metric score of 82.5 on synthetic data and 22.2 on the RULEC dataset, which was not used at the training stage.

Bibliographic reference: 

Хабутдинов И.А., Чащин А.В., Грабовой А.В., Кильдяков А.С., Чехович Ю.В. RuGECToR: Rule-Based Neural Network Model for Russian Language Grammatical Error Correction // Programming and computer software. 2024. Vol. 50, No. 4. P. 315-321.

The publication has a version in another language or has appeared in another edition, for example, in an electronic (or online) version of the journal: 

Yes

Related publication: