Problem statement. Automatic term extraction from Russian-language scientific texts is a pressing problem in computational linguistics and information retrieval. How well large language models perform without additional training, compared with architectures adapted to the task, remains understudied, especially for Russian and for specialized scientific corpora.
Objective. The aim of this study is to investigate and compare two approaches to automatic term extraction from Russian-language scientific texts: a specialized neural network solution based on the T5 architecture, fine-tuned for a sequence-to-sequence task, and general-purpose large language models.
Results. This study implemented a set of programs and models for extracting terms from abstracts and full texts of scientific publications based on the CL-RuTerm3 dataset. An additional experiment evaluated the large language models under few-shot learning conditions.
Practical significance. The developed specialized solution can be used for automatic and semi-automated term tagging in Russian-language scientific texts, as well as for creating and expanding terminological corpora. The results of the comparative analysis demonstrate the feasibility of using large language models as an auxiliary tool or as a baseline.