Neural models for morphological generation, analysis and lemmatization in 22 languages

Morphological models for generation, lemmatization and analysis in 22 languages. The models are trained with OpenNMT-py (https://github.com/OpenNMT/OpenNMT-py). Feed one word at a time, split into characters (kissa -> k i s s a). Supported languages: German (deu), Kven (fkv), Komi-Zyrian (kpv), Moksha...
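
The input convention described above (one word at a time, split into characters) can be sketched in Python as follows. This is a minimal illustration, not part of the dataset: the function names to_model_input and prepare_lines are hypothetical, and the one-word-per-line layout is an assumption about how a source file for an OpenNMT-py model would usually be arranged.

```python
from typing import Iterable, List


def to_model_input(word: str) -> str:
    """Return one word as space-separated characters, e.g. 'kissa' -> 'k i s s a'.

    Hypothetical helper: the record only states that the models expect a
    single word split into characters.
    """
    word = word.strip()
    # Reject empty input and anything containing internal whitespace,
    # since the models are fed one word at a time.
    if not word or any(ch.isspace() for ch in word):
        raise ValueError("expected exactly one word, got %r" % word)
    return " ".join(word)


def prepare_lines(words: Iterable[str]) -> List[str]:
    """Convert several words to model input, one word per output line (assumed layout)."""
    return [to_model_input(w) for w in words]


if __name__ == "__main__":
    for line in prepare_lines(["kissa", "koira"]):
        print(line)
```

Each resulting line holds a single word, so the output of the model for line n corresponds to input word n.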

Bibliographic Details
Main Authors: Hämäläinen, Mika, Partanen, Niko, Rueter, Jack, Alnajjar, Khalid
Format: Dataset
Language: Finnish
Published: Zenodo 2020
Subjects:
fst
Online Access: https://dx.doi.org/10.5281/zenodo.3926768
https://zenodo.org/record/3926768
id ftdatacite:10.5281/zenodo.3926768
record_format openpolar
institution Open Polar
collection DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id ftdatacite
language Finnish
topic morphology
fst
endangered languages
neural models
topic_facet morphology
fst
endangered languages
neural models
description Morphological models for generation, lemmatization and analysis in 22 languages. The models are trained with OpenNMT-py (https://github.com/OpenNMT/OpenNMT-py). Feed one word at a time, split into characters (kissa -> k i s s a). Supported languages: German (deu), Kven (fkv), Komi-Zyrian (kpv), Moksha (mdf), Mansi (mns), Erzya (myv), Norwegian Bokmål (nob), Russian (rus), South Sami (sma), Lule Sami (smj), Skolt Sami (sms), Võro (vro), Finnish (fin), Komi-Permyak (koi), Latvian (lav), Eastern Mari (mhr), Western Mari (mrj), Namonuito (nmt), Olonets-Karelian (olo), Pite Sami (sje), Northern Sami (sme), Inari Sami (smn) and Udmurt (udm). Cite: Hämäläinen, M., Partanen, N., Rueter, J., & Alnajjar, K. (2021). Neural Morphology Dataset and Models for Multiple Languages, from the Large to the Endangered. In Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa 2021).
format Dataset
author Hämäläinen, Mika
Partanen, Niko
Rueter, Jack
Alnajjar, Khalid
author_facet Hämäläinen, Mika
Partanen, Niko
Rueter, Jack
Alnajjar, Khalid
author_sort Hämäläinen, Mika
title Neural models for morphological generation, analysis and lemmatization in 22 languages
title_sort neural models for morphological generation, analysis and lemmatization in 22 languages
publisher Zenodo
publishDate 2020
url https://dx.doi.org/10.5281/zenodo.3926768
https://zenodo.org/record/3926768
long_lat ENVELOPE(155.950,155.950,54.200,54.200)
ENVELOPE(27.029,27.029,68.906,68.906)
ENVELOPE(26.200,26.200,66.883,66.883)
geographic Rus’
Inari
Hämäläinen
geographic_facet Rus’
Inari
Hämäläinen
genre karelian
sami
Mansi
genre_facet karelian
sami
Mansi
op_relation https://dx.doi.org/10.5281/zenodo.3926769
op_rights Open Access
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
cc-by-4.0
info:eu-repo/semantics/openAccess
op_rightsnorm CC-BY
op_doi https://doi.org/10.5281/zenodo.3926768
https://doi.org/10.5281/zenodo.3926769
_version_ 1766054638543437824