Utilizing Language Technology in the Documentation of Endangered Uralic Languages
The paper describes work-in-progress by the Pite Saami, Kola Saami and Izhva Komi language documentation projects, all of which record new spoken language data, digitize available recordings and annotate these multimedia data in order to provide comprehensive language corpora as databases for future...
Published in: | Northern European Journal of Language Technology |
---|---|
Main Authors: | Gerstenberger, Ciprian; Partanen, Niko; Rießler, Michael; Wilbur, Joshua |
Format: | Article in Journal/Newspaper |
Language: | unknown |
Published: | Linkoping University Electronic Press 2016 |
Subjects: | |
Online Access: | http://dx.doi.org/10.3384/nejlt.2000-1533.1643 https://nejlt.ep.liu.se/article/download/1660/1006 |
id |
crlinkopinguep:10.3384/nejlt.2000-1533.1643 |
---|---|
record_format |
openpolar |
institution |
Open Polar |
collection |
LiU Electronic Press (Linköping University) |
op_collection_id |
crlinkopinguep |
language |
unknown |
description |
The paper describes work-in-progress by the Pite Saami, Kola Saami and Izhva Komi language documentation projects, all of which record new spoken language data, digitize available recordings and annotate these multimedia data in order to provide comprehensive language corpora as databases for future research on and for endangered – and under-described – Uralic speech communities. Applying language technology in language documentation helps us to create more systematically annotated corpora, rather than eclectic data collections. Specifically, we describe a script providing interactivity between different morphosyntactic analysis modules implemented as Finite State Transducers and ELAN, a Graphical User Interface tool for annotating and presenting multimodal corpora. Ultimately, the spoken corpora created in our projects will be useful for scientifically significant quantitative investigations on these languages in the future. |
format |
Article in Journal/Newspaper |
author |
Gerstenberger, Ciprian; Partanen, Niko; Rießler, Michael; Wilbur, Joshua |
title |
Utilizing Language Technology in the Documentation of Endangered Uralic Languages |
publisher |
Linkoping University Electronic Press |
publishDate |
2016 |
url |
http://dx.doi.org/10.3384/nejlt.2000-1533.1643 https://nejlt.ep.liu.se/article/download/1660/1006 |
genre |
Komi language; saami |
op_source |
Northern European Journal of Language Technology volume 4 ISSN 2000-1533 |
op_rights |
https://creativecommons.org/licenses/by-nc/4.0 |
op_doi |
https://doi.org/10.3384/nejlt.2000-1533.1643 |