Два метода выявления русских заимствований в якутских текстах

This article examines two methods for identifying Russian borrowings in Yakut texts. A Russian borrowing is understood here as a lexical item whose root has not been adapted to Yakut phonetics and is written as in the source language. Given that most borrowings in Yakut...

Bibliographic Details
Main Authors:	Cortegoso Vissio, Nicolas; Zakharov, Victor
Format:	Article in Journal/Newspaper
Language:	Russian
Published:	International Journal of Open Information Technologies 2022
Subjects:	Yakut
Online Access:	http://injoit.org/index.php/j1/article/view/1425

id	ftjinjoit:oai:ojs.injoit.org:article/1425
record_format	openpolar
spelling	ftjinjoit:oai:ojs.injoit.org:article/1425 2023-05-15T18:44:26+02:00 Два метода выявления русских заимствований в якутских текстах Two methods for identifying Russian words in Yakut texts Cortegoso Vissio, Nicolas; Zakharov, Victor 2022-11-01 application/pdf http://injoit.org/index.php/j1/article/view/1425 rus rus International Journal of Open Information Technologies http://injoit.org/index.php/j1/article/view/1425/1324 http://injoit.org/index.php/j1/article/downloadSuppFile/1425/453 http://injoit.org/index.php/j1/article/downloadSuppFile/1425/454 http://injoit.org/index.php/j1/article/view/1425 Copyright (c) 2022 International Journal of Open Information Technologies International Journal of Open Information Technologies; Vol 10, No 11 (2022); 26-34 2307-8162 info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion 2022 ftjinjoit 2023-03-14T17:58:22Z The article discusses two methods for extracting foreign words from Yakut texts. Foreign words are understood here as non-integrated lexical units that have not been adapted to Yakut orthography and are therefore written as in the original language. Since most foreign words in Yakut texts come from Russian, it is assumed that they have a particular form by which they can be distinguished from Yakut word forms. The first method is rule-based: it implements an algorithm that detects letter combinations foreign to the Yakut language. The second method applies a statistical approach to model and differentiate Yakut and Russian letter combinations. The effectiveness of both methods in extracting Russian foreign words is compared with the results of manual highlighting performed by Russian speakers on 6 Yakut texts. This work is a continuation of the article “Identification of Russian borrowings in Yakut texts”, published in “Computer Linguistics and Computational Ontologies. Number 5 (Proceedings of the XXIV Joint International Conference "Internet and Modern Society", IMS-2022)”. Article in Journal/Newspaper Yakut International Journal of Open Information Technologies (INJOIT)
institution	Open Polar
collection	International Journal of Open Information Technologies (INJOIT)
op_collection_id	ftjinjoit
language	Russian
description	The article discusses two methods for extracting foreign words from Yakut texts. Foreign words are understood here as non-integrated lexical units that have not been adapted to Yakut orthography and are therefore written as in the original language. Since most foreign words in Yakut texts come from Russian, it is assumed that they have a particular form by which they can be distinguished from Yakut word forms. The first method is rule-based: it implements an algorithm that detects letter combinations foreign to the Yakut language. The second method applies a statistical approach to model and differentiate Yakut and Russian letter combinations. The effectiveness of both methods in extracting Russian foreign words is compared with the results of manual highlighting performed by Russian speakers on 6 Yakut texts. This work is a continuation of the article “Identification of Russian borrowings in Yakut texts”, published in “Computer Linguistics and Computational Ontologies. Number 5 (Proceedings of the XXIV Joint International Conference "Internet and Modern Society", IMS-2022)”.
format	Article in Journal/Newspaper
author	Cortegoso Vissio, Nicolas; Zakharov, Victor
spellingShingle	Cortegoso Vissio, Nicolas Zakharov, Victor Два метода выявления русских заимствований в якутских текстах
author_facet	Cortegoso Vissio, Nicolas; Zakharov, Victor
author_sort	Cortegoso Vissio, Nicolas
title	Два метода выявления русских заимствований в якутских текстах
title_short	Два метода выявления русских заимствований в якутских текстах
title_full	Два метода выявления русских заимствований в якутских текстах
title_fullStr	Два метода выявления русских заимствований в якутских текстах
title_full_unstemmed	Два метода выявления русских заимствований в якутских текстах
title_sort	два метода выявления русских заимствований в якутских текстах
publisher	International Journal of Open Information Technologies
publishDate	2022
url	http://injoit.org/index.php/j1/article/view/1425
genre	Yakut
genre_facet	Yakut
op_source	International Journal of Open Information Technologies; Vol 10, No 11 (2022); 26-34 2307-8162
op_relation	http://injoit.org/index.php/j1/article/view/1425/1324 http://injoit.org/index.php/j1/article/downloadSuppFile/1425/453 http://injoit.org/index.php/j1/article/downloadSuppFile/1425/454 http://injoit.org/index.php/j1/article/view/1425
op_rights	Copyright (c) 2022 International Journal of Open Information Technologies
_version_	1766235112623570944
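
The abstract in this record names a rule-based method and a statistical method but gives no implementation details. The sketch below is only one possible reading of each idea, not the authors' algorithm. It rests on assumptions: the loan-only letter set and the soft-sign rule are a simplified reading of standard descriptions of Yakut orthography, the smoothed character-bigram model is a stand-in for the unspecified statistical approach, and the word lists, constants, and function names are all hypothetical toy illustrations.

```python
import math
import re
from collections import Counter

# ASSUMPTION: letters usually described as occurring only in loanwords in
# Yakut orthography. The article's real rule set is not given in this record.
LOAN_ONLY_LETTERS = set("веёжзфцшщъюя")
# ASSUMPTION: a soft sign counts as native only inside the digraphs дь and нь.
FOREIGN_SOFT_SIGN = re.compile(r"(?<![дн])ь")


def looks_foreign_by_rules(word):
    """Flag a word containing a letter or letter combination assumed non-Yakut."""
    w = word.lower()
    return any(ch in LOAN_ONLY_LETTERS for ch in w) or bool(FOREIGN_SOFT_SIGN.search(w))


def bigrams(word):
    """Character bigrams of a word padded with start (^) and end ($) markers."""
    padded = "^" + word.lower() + "$"
    return list(zip(padded, padded[1:]))


def train(words):
    """Count bigrams and their left contexts over a list of words."""
    pair_counts, context_counts = Counter(), Counter()
    for w in words:
        for a, b in bigrams(w):
            pair_counts[(a, b)] += 1
            context_counts[a] += 1
    return pair_counts, context_counts


def log_prob(word, model, vocab_size):
    """Log-probability of a word under an add-one smoothed bigram model."""
    pair_counts, context_counts = model
    return sum(
        math.log((pair_counts[(a, b)] + 1) / (context_counts[a] + vocab_size))
        for a, b in bigrams(word)
    )


# Hypothetical toy training lists; a real model would need large corpora.
YAKUT_TOY = ["саха", "тыл", "уу", "күн", "киһи", "оҕо", "үлэ", "дьон", "олох", "сир"]
RUSSIAN_TOY = ["школа", "журнал", "завод", "фабрика", "цех", "язык"]

# Shared symbol inventory: every training letter plus the end marker "$".
VOCAB_SIZE = len(set("".join(YAKUT_TOY + RUSSIAN_TOY))) + 1
YAKUT_MODEL = train(YAKUT_TOY)
RUSSIAN_MODEL = train(RUSSIAN_TOY)


def classify_by_bigrams(word):
    """Label a word by whichever toy model assigns it the higher log-probability."""
    ru = log_prob(word, RUSSIAN_MODEL, VOCAB_SIZE)
    ya = log_prob(word, YAKUT_MODEL, VOCAB_SIZE)
    return "russian" if ru > ya else "yakut"
```

With these toy lists, `looks_foreign_by_rules("завод")` is true and `looks_foreign_by_rules("киһи")` is false, and `classify_by_bigrams` assigns the same two words to the Russian and Yakut models respectively. Comparing either function's output against manually annotated texts, as the abstract describes, would require the annotated data, which is not part of this record.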

Similar Items