INEL Nganasan Corpus

Corpus Citation Brykina, Maria; Gusev, Valentin; Szeverényi, Sándor; Wagner-Nagy, Beáta. INEL Nganasan Corpus. Version 1.0. Publication date 2025-05-02. https://hdl.handle.net/11022/0000-0007-FE63-C. Archived at Universität Hamburg. In: The INEL corpora of indigenous Northern Eurasian languages. htt...


Bibliographic Details
Main Authors: Brykina, Maria; Gusev, Valentin; Szeverényi, Sándor; Wagner-Nagy, Beáta
Other Authors: Lazarenko, Elena; Riaposov, Aleksandr; Lehmberg, Timm; Arkhipov, Alexandre
Format: Dataset
Language: unknown
Published: 2025
Subjects:
Online Access: https://www.fdr.uni-hamburg.de/record/17419
https://doi.org/10.25592/uhhfdm.17419
_version_ 1835017605484642304
author Brykina, Maria
Gusev, Valentin
Szeverényi, Sándor
Wagner-Nagy, Beáta
author2 Lazarenko, Elena
Riaposov, Aleksandr
Lehmberg, Timm
Wagner-Nagy, Beáta
Arkhipov, Alexandre
author_sort Brykina, Maria
collection Unknown
description Corpus Citation: Brykina, Maria; Gusev, Valentin; Szeverényi, Sándor; Wagner-Nagy, Beáta. INEL Nganasan Corpus. Version 1.0. Publication date 2025-05-02. https://hdl.handle.net/11022/0000-0007-FE63-C. Archived at Universität Hamburg. In: The INEL corpora of indigenous Northern Eurasian languages. https://hdl.handle.net/11022/0000-0007-F45A-1
Corpus Description: The INEL Nganasan corpus was created within the long-term INEL project ("Grammatical Descriptions, Corpora and Language Technology for Indigenous Northern Eurasian Languages"), 2016–2033. The corpus is largely based on the Nganasan Spoken Language Corpus, which has been adapted to the INEL standards and supplemented with new texts. The corpus enables typologically oriented, corpus-based research on Nganasan and expands the documentation of the lesser-described indigenous languages of Northern Eurasia.
The INEL Nganasan corpus consists of two parts. The glossed (searchable) part includes texts provided with source media files (whenever available) and annotated transcripts. The archival part contains non-glossed texts, represented either by audio recordings (optionally with preliminary transcriptions) or by scanned pages of manuscripts or publications. The corpus includes Nganasan texts recorded between 1933 and 2019.
The sources of the corpus are:
Audio recordings made by Maria Brykina, Valentin Gusev, Sándor Szeverényi and Beáta Wagner-Nagy.
Legacy audio recordings made by A. Aksyonova, Svetlana S. Aksyonova, Josefina Budzisch, Michael Daniel, Oksana E. Dobzhanskaya, Eugene Helimski, Nadezhda T. Kosterkina, Jean-Luc Lambert, Marina D. Lyublinskaya, N. A. Popov, Florian Sobanski, Eugénie Stapert, Larisa Y. Turdagina, Zsuzsa Várnai, Peter Voliak, Tatjana Zhdanova and possibly other people.
Legacy manuscript transcriptions made by Ekaterina P. Boldt, Eugene Helimski, Nadezhda T. Kosterkina, I. E. Machkinis, E. P. Nojfeld, A. K. Stolyarova, Natalia M. Tereshchenko and Tatjana Zhdanova.
Texts published by Ekaterina P. ...
format Dataset
genre Nganasan*
samoyed*
geographic Nadezhda
Gusev
Zhdanova
id ftunihamburgdata:oai:fdr.uni-hamburg.de:17419
institution Open Polar
language unknown
long_lat ENVELOPE(64.167,64.167,-72.733,-72.733)
ENVELOPE(43.341,43.341,66.102,66.102)
ENVELOPE(103.874,103.874,76.357,76.357)
op_collection_id ftunihamburgdata
op_doi https://doi.org/10.25592/uhhfdm.17419
https://doi.org/10.25592/uhhfdm.17418
op_relation info:eu-repo/semantics/altIdentifier/handle/11022/0000-0007-FE63-C
doi:10.25592/uhhfdm.17418
doi:10.25592/uhhfdm.17419
op_rights info:eu-repo/semantics/openAccess
https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode
publishDate 2025
record_format openpolar
spelling ftunihamburgdata:oai:fdr.uni-hamburg.de:17419 2025-06-15T14:40:36+00:00 nio info:eu-repo/semantics/other 2025-05-19T03:13:49Z
title INEL Nganasan Corpus
title_sort inel nganasan corpus
topic Uralic
Samoyedic
Nganasan
endangered language
language contact
language documentation
legacy data
INEL
AdWHH
text corpus
speech corpus
parallel texts
folklore
tales
narrative
song
transcription
time-aligned
audio
morphological glossing
part-of-speech
borrowings
code-switching
existential predication
locative predication
possessive predication
English translation
Russian translation
EXMARaLDA
ELAN
XML
ISO/TEI
url https://www.fdr.uni-hamburg.de/record/17419
https://doi.org/10.25592/uhhfdm.17419