Utilizing Language Technology in the Documentation of Endangered Uralic Languages

The paper describes work-in-progress by the Pite Saami, Kola Saami and Izhva Komi language documentation projects, all of which record new spoken language data, digitize available recordings and annotate these multimedia data in order to provide comprehensive language corpora as databases for future...


Bibliographic Details
Published in: Northern European Journal of Language Technology
Main Authors: Gerstenberger, Ciprian, Partanen, Niko, Rießler, Michael, Wilbur, Joshua
Format: Article in Journal/Newspaper
Language: unknown
Published: Linkoping University Electronic Press 2016
Online Access: http://dx.doi.org/10.3384/nejlt.2000-1533.1643
https://nejlt.ep.liu.se/article/download/1660/1006
id crlinkopinguep:10.3384/nejlt.2000-1533.1643
record_format openpolar
institution Open Polar
collection LiU Electronic Press (Linköping University)
op_collection_id crlinkopinguep
language unknown
description The paper describes work-in-progress by the Pite Saami, Kola Saami and Izhva Komi language documentation projects, all of which record new spoken language data, digitize available recordings and annotate these multimedia data in order to provide comprehensive language corpora as databases for future research on and for endangered – and under-described – Uralic speech communities. Applying language technology in language documentation helps us to create more systematically annotated corpora, rather than eclectic data collections. Specifically, we describe a script providing interactivity between different morphosyntactic analysis modules implemented as Finite State Transducers and ELAN, a Graphical User Interface tool for annotating and presenting multimodal corpora. Ultimately, the spoken corpora created in our projects will be useful for scientifically significant quantitative investigations on these languages in the future.
format Article in Journal/Newspaper
author Gerstenberger, Ciprian
Partanen, Niko
Rießler, Michael
Wilbur, Joshua
author_sort Gerstenberger, Ciprian
title Utilizing Language Technology in the Documentation of Endangered Uralic Languages
publisher Linkoping University Electronic Press
publishDate 2016
url http://dx.doi.org/10.3384/nejlt.2000-1533.1643
https://nejlt.ep.liu.se/article/download/1660/1006
genre Komi language
saami
op_source Northern European Journal of Language Technology
volume 4
ISSN 2000-1533
op_rights https://creativecommons.org/licenses/by-nc/4.0
op_doi https://doi.org/10.3384/nejlt.2000-1533.1643
container_title Northern European Journal of Language Technology
container_volume 4
_version_ 1810455002114686976