Toward a Corpus of Tundra Nenets: Stages and Challenges in Building a Corpus

In this paper, we report on the main lessons from the first year of Tundra Nenets (Samoyedic, Uralic) corpus-building work carried out at the Hungarian Research Institute for Linguistics. The aim of our work is twofold. First, we collect, process, and archive written (and in the latter part of...

Bibliographic Details
Published in: Proceedings of the Workshop on Computational Methods for Endangered Languages
Main Authors: Mus, Nikolett; Metzger, Réka
Format: Article in Journal/Newspaper
Language: English
Published: Proceedings of the Workshop on Computational Methods for Endangered Languages 2021
Subjects:
Online Access: https://journals.colorado.edu/index.php/computel/article/view/975
https://doi.org/10.33011/computel.v2i.975
id ftucoloradobould:oai:journals.colorado.edu:article/975
record_format openpolar
spelling ftucoloradobould:oai:journals.colorado.edu:article/975 2023-05-15T17:14:24+02:00 2021-03-02 application/pdf eng info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion Extended abstract and resource paper 2022-10-18T09:18:49Z
institution Open Polar
collection University of Colorado Boulder Open Journals
op_collection_id ftucoloradobould
language English
description In this paper, we report on the main lessons from the first year of Tundra Nenets (Samoyedic, Uralic) corpus-building work carried out at the Hungarian Research Institute for Linguistics. The aim of our work is twofold. First, we collect, process, and archive written (and, in the latter part of the project period, spoken) Tundra Nenets data. Second, we build a parallel Tundra Nenets–Russian corpus to support and encourage research on Tundra Nenets syntax, preferably synchronic. After discussing certain language- and culture-specific factors that potentially influence the sampling method, we present the stages of our work in detail.
format Article in Journal/Newspaper
author Mus, Nikolett
Metzger, Réka
title Toward a Corpus of Tundra Nenets: Stages and Challenges in Building a Corpus
publisher Proceedings of the Workshop on Computational Methods for Endangered Languages
publishDate 2021
url https://journals.colorado.edu/index.php/computel/article/view/975
https://doi.org/10.33011/computel.v2i.975
genre nenets
samoyed*
Tundra
op_source Proceedings of the Workshop on Computational Methods for Endangered Languages; Vol. 2 (2021): Proceedings of the 4th Workshop on Computational Methods for Endangered Languages (Resource Papers and Extended Abstracts); 4-9
op_relation https://journals.colorado.edu/index.php/computel/article/view/975/901
https://journals.colorado.edu/index.php/computel/article/view/975
doi:10.33011/computel.v2i.975
op_doi https://doi.org/10.33011/computel.v2i.975
container_title Proceedings of the Workshop on Computational Methods for Endangered Languages
container_volume 2
container_issue 2