FST Morphology for the Endangered Skolt Sami Language

We present advances in the development of an FST-based morphological analyzer and generator for Skolt Sami. Like other minority Uralic languages, Skolt Sami exhibits a rich morphology, on the one hand, and there is little gold standard material for it, on the other. This makes NLP approaches for it...


Bibliographic Details
Main Authors: Rueter, Jack, Hämäläinen, Mika
Format: Text
Language: unknown
Published: 2020
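Subjects: Computer Science - Computation and Language; Computer Science - Formal Languages and Automata Theory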
Online Access: http://arxiv.org/abs/2004.04803
id ftarxivpreprints:oai:arXiv.org:2004.04803
record_format openpolar
spelling ftarxivpreprints:oai:arXiv.org:2004.04803 2023-09-05T13:22:53+02:00 FST Morphology for the Endangered Skolt Sami Language Rueter, Jack Hämäläinen, Mika 2020-04-09 http://arxiv.org/abs/2004.04803 unknown http://arxiv.org/abs/2004.04803 Computer Science - Computation and Language Computer Science - Formal Languages and Automata Theory text 2020 ftarxivpreprints 2023-08-16T15:49:28Z We present advances in the development of an FST-based morphological analyzer and generator for Skolt Sami. Like other minority Uralic languages, Skolt Sami exhibits a rich morphology, on the one hand, and there is little gold standard material for it, on the other. This makes NLP approaches for its study difficult without a solid morphological analysis. The language is severely endangered and the work presented in this paper forms a part of a greater whole in its revitalization efforts. Furthermore, we intersperse our description with facilitation and description practices not well documented in the infrastructure. Currently, the analyzer covers over 30,000 Skolt Sami words in 148 inflectional paradigms and over 12 derivational forms. Comment: Accepted to The 1st Joint SLTU and CCURL Workshop (SLTU-CCURL 2020) Text sami ArXiv.org (Cornell University Library)
institution Open Polar
collection ArXiv.org (Cornell University Library)
op_collection_id ftarxivpreprints
language unknown
topic Computer Science - Computation and Language
Computer Science - Formal Languages and Automata Theory
description We present advances in the development of an FST-based morphological analyzer and generator for Skolt Sami. Like other minority Uralic languages, Skolt Sami exhibits a rich morphology, on the one hand, and there is little gold standard material for it, on the other. This makes NLP approaches for its study difficult without a solid morphological analysis. The language is severely endangered and the work presented in this paper forms a part of a greater whole in its revitalization efforts. Furthermore, we intersperse our description with facilitation and description practices not well documented in the infrastructure. Currently, the analyzer covers over 30,000 Skolt Sami words in 148 inflectional paradigms and over 12 derivational forms. Comment: Accepted to The 1st Joint SLTU and CCURL Workshop (SLTU-CCURL 2020)
format Text
author Rueter, Jack
Hämäläinen, Mika
title FST Morphology for the Endangered Skolt Sami Language
publishDate 2020
url http://arxiv.org/abs/2004.04803
genre sami
op_relation http://arxiv.org/abs/2004.04803