Toward a Corpus of Tundra Nenets: Stages and Challenges in Building a Corpus
In this paper, we report on the main lessons drawn from the first year of a Tundra Nenets (Samoyedic, Uralic) corpus building work carried out in the Hungarian Research Institute for Linguistics. The aim of our work is twofold. First we collect, process and archive written (and in the latter part of...
Published in: | Proceedings of the Workshop on Computational Methods for Endangered Languages |
---|---|
Main Authors: | Mus, Nikolett; Metzger, Réka |
Format: | Article in Journal/Newspaper |
Language: | English |
Published: | Proceedings of the Workshop on Computational Methods for Endangered Languages, 2021 |
Online Access: | https://journals.colorado.edu/index.php/computel/article/view/975 https://doi.org/10.33011/computel.v2i.975 |
id |
ftucoloradobould:oai:journals.colorado.edu:article/975 |
---|---|
record_format |
openpolar |
spelling |
ftucoloradobould:oai:journals.colorado.edu:article/975 2023-05-15T17:14:24+02:00 Toward a Corpus of Tundra Nenets: Stages and Challenges in Building a Corpus Mus, Nikolett Metzger, Réka 2021-03-02 application/pdf https://journals.colorado.edu/index.php/computel/article/view/975 https://doi.org/10.33011/computel.v2i.975 eng eng Proceedings of the Workshop on Computational Methods for Endangered Languages https://journals.colorado.edu/index.php/computel/article/view/975/901 https://journals.colorado.edu/index.php/computel/article/view/975 doi:10.33011/computel.v2i.975 Proceedings of the Workshop on Computational Methods for Endangered Languages; Vol. 2 (2021): Proceedings of the 4th Workshop on Computational Methods for Endangered Languages (Resource Papers and Extended Abstracts); 4-9 info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion Extended abstract and resource paper 2021 ftucoloradobould https://doi.org/10.33011/computel.v2i.975 2022-10-18T09:18:49Z In this paper, we report on the main lessons drawn from the first year of a Tundra Nenets (Samoyedic, Uralic) corpus building work carried out in the Hungarian Research Institute for Linguistics. The aim of our work is twofold. First we collect, process and archive written (and in the latter part of the project period spoken) data of Tundra Nenets. Second, we build a parallel corpus, i.e. a Tundra Nenets–Russian corpus, to support and encourage preferably synchronic syntactic research on Tundra Nenets. After discussing certain language and culture specific factors that potentially influence the sampling method, we present the stages of our work in detail. Article in Journal/Newspaper nenets samoyed* Tundra University of Colorado Boulder Open Journals Proceedings of the Workshop on Computational Methods for Endangered Languages 2 2 |
institution |
Open Polar |
collection |
University of Colorado Boulder Open Journals |
op_collection_id |
ftucoloradobould |
language |
English |
description |
In this paper, we report on the main lessons drawn from the first year of a Tundra Nenets (Samoyedic, Uralic) corpus building work carried out in the Hungarian Research Institute for Linguistics. The aim of our work is twofold. First we collect, process and archive written (and in the latter part of the project period spoken) data of Tundra Nenets. Second, we build a parallel corpus, i.e. a Tundra Nenets–Russian corpus, to support and encourage preferably synchronic syntactic research on Tundra Nenets. After discussing certain language and culture specific factors that potentially influence the sampling method, we present the stages of our work in detail. |
format |
Article in Journal/Newspaper |
author |
Mus, Nikolett; Metzger, Réka |
spellingShingle |
Mus, Nikolett Metzger, Réka Toward a Corpus of Tundra Nenets: Stages and Challenges in Building a Corpus |
author_facet |
Mus, Nikolett; Metzger, Réka |
author_sort |
Mus, Nikolett |
title |
Toward a Corpus of Tundra Nenets: Stages and Challenges in Building a Corpus |
title_sort |
toward a corpus of tundra nenets: stages and challenges in building a corpus |
publisher |
Proceedings of the Workshop on Computational Methods for Endangered Languages |
publishDate |
2021 |
url |
https://journals.colorado.edu/index.php/computel/article/view/975 https://doi.org/10.33011/computel.v2i.975 |
genre |
nenets samoyed* Tundra |
genre_facet |
nenets samoyed* Tundra |
op_source |
Proceedings of the Workshop on Computational Methods for Endangered Languages; Vol. 2 (2021): Proceedings of the 4th Workshop on Computational Methods for Endangered Languages (Resource Papers and Extended Abstracts); 4-9 |
op_relation |
https://journals.colorado.edu/index.php/computel/article/view/975/901 https://journals.colorado.edu/index.php/computel/article/view/975 doi:10.33011/computel.v2i.975 |
op_doi |
https://doi.org/10.33011/computel.v2i.975 |
container_title |
Proceedings of the Workshop on Computational Methods for Endangered Languages |
container_volume |
2 |
container_issue |
2 |
_version_ |
1766071787201757184 |