FST Morphology for the Endangered Skolt Sami Language
We present advances in the development of an FST-based morphological analyzer and generator for Skolt Sami. Like other minority Uralic languages, Skolt Sami exhibits a rich morphology, on the one hand, and there is little gold standard material for it, on the other. This makes NLP approaches for it...
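Main Authors: | Rueter, Jack; Hämäläinen, Mika |
---|---|
Format: | Text |
Language: | unknown |
Published: | 2020 |
Subjects: | Computer Science - Computation and Language; Computer Science - Formal Languages and Automata Theory |
Online Access: | http://arxiv.org/abs/2004.04803 |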
id | ftarxivpreprints:oai:arXiv.org:2004.04803 |
---|---|
record_format | openpolar |
spelling | ftarxivpreprints:oai:arXiv.org:2004.04803 2023-09-05T13:22:53+02:00 FST Morphology for the Endangered Skolt Sami Language Rueter, Jack Hämäläinen, Mika 2020-04-09 http://arxiv.org/abs/2004.04803 unknown http://arxiv.org/abs/2004.04803 Computer Science - Computation and Language Computer Science - Formal Languages and Automata Theory text 2020 ftarxivpreprints 2023-08-16T15:49:28Z We present advances in the development of an FST-based morphological analyzer and generator for Skolt Sami. Like other minority Uralic languages, Skolt Sami exhibits a rich morphology, on the one hand, and there is little gold standard material for it, on the other. This makes NLP approaches for its study difficult without a solid morphological analysis. The language is severely endangered and the work presented in this paper forms a part of a greater whole in its revitalization efforts. Furthermore, we intersperse our description with facilitation and description practices not well documented in the infrastructure. Currently, the analyzer covers over 30,000 Skolt Sami words in 148 inflectional paradigms and over 12 derivational forms. Comment: Accepted to The 1st Joint SLTU and CCURL Workshop (SLTU-CCURL 2020) Text sami ArXiv.org (Cornell University Library) |
institution | Open Polar |
collection | ArXiv.org (Cornell University Library) |
op_collection_id | ftarxivpreprints |
language | unknown |
topic | Computer Science - Computation and Language; Computer Science - Formal Languages and Automata Theory |
description | We present advances in the development of an FST-based morphological analyzer and generator for Skolt Sami. Like other minority Uralic languages, Skolt Sami exhibits a rich morphology, on the one hand, and there is little gold standard material for it, on the other. This makes NLP approaches for its study difficult without a solid morphological analysis. The language is severely endangered and the work presented in this paper forms a part of a greater whole in its revitalization efforts. Furthermore, we intersperse our description with facilitation and description practices not well documented in the infrastructure. Currently, the analyzer covers over 30,000 Skolt Sami words in 148 inflectional paradigms and over 12 derivational forms. Comment: Accepted to The 1st Joint SLTU and CCURL Workshop (SLTU-CCURL 2020) |
format | Text |
author | Rueter, Jack; Hämäläinen, Mika |
title | FST Morphology for the Endangered Skolt Sami Language |
publishDate | 2020 |
url | http://arxiv.org/abs/2004.04803 |
genre | sami |
op_relation | http://arxiv.org/abs/2004.04803 |
_version_ | 1776203453411885056 |