FinUgRevita: Developing Language Technology Tools for Udmurt and Mansi

Bibliographic Details
Published in: Septentrio Conference Series
Main Authors: Vincze, Veronika; Nagy, Ágoston; Horváth, Csilla; Szilágyi, Norbert; Kozmács, István; Bogár, Edit; Fenyvesi, Anna
Format: Article in Journal/Newspaper
Language: English
Published: Septentrio Academic Publishing 2015
Online Access: https://septentrio.uit.no/index.php/SCS/article/view/3473
https://doi.org/10.7557/5.3473
Description
Summary: Nowadays, digital language use, such as reading and writing e-mails, chats, messages, weblogs and comments on websites and social media platforms such as Facebook and Twitter, has increased the amount of written language that most users produce. It is therefore especially important for speakers of minority languages to be able to use their own languages in the digital world too. The FinUgRevita project aims to provide computational language tools for endangered indigenous Finno-Ugric languages in Russia, helping speakers use these languages in the digital space. Currently, we are working on two Finno-Ugric minority languages, namely Udmurt and Mansi. In the project, we have been developing electronic dictionaries for both languages; in addition, we have been creating corpora with a substantial number of texts collected from sources such as literature, newspaper articles and social media, among others. We have also been implementing morphological analyzers for both languages, exploiting the lexical entries of our dictionaries. We believe that the results achieved by the FinUgRevita project will contribute to the revitalization of Udmurt and Mansi, and that the tools to be developed will help these languages establish their existence in the digital space as well.