Problem statement. Automatic term extraction from Russian-language scientific texts is a pressing problem in computational linguistics and information retrieval. How well large language models perform without additional training, compared with architectures adapted to the task, remains understudied, especially for Russian and for specialized scientific corpora.
Objective. The aim of this study is to investigate and compare two approaches to automatic term extraction from Russian-language scientific texts: a specialized neural network solution based on the T5 architecture, fine-tuned for a sequence-to-sequence task, and general-purpose large language models.
Results. This study implemented a set of programs and models for extracting terms from abstracts and full texts of scientific publications based on the CL-RuTerm3 dataset. An additional experiment evaluated the large language models under few-shot learning conditions.
Practical significance. The developed specialized solution can be used for automatic and semi-automated term tagging in Russian-language scientific texts, as well as for creating and expanding terminological corpora. The results of the comparative analysis demonstrate the feasibility of using large language models as an auxiliary tool or as a baseline.