Community-based corpus-building: Three case studies

We describe three ongoing projects involving different First Peoples’ languages of Canada (Cree/nehiyawewin, Dene Sųłiné, and Nakoda/Stoney) that centre around the recording, transcription, compilation, and analysis of spontaneous oral language use (some narrative, some conversation) using freely av...

Bibliographic Details
Main Authors:	Rice, Sally; Thunder, Dorothy
Format:	Text
Language:	unknown
Published:	2017
Subjects:	Nakoda
Online Access:	http://hdl.handle.net/10125/42052

id	ftunivhawaiimano:oai:scholarspace.manoa.hawaii.edu:10125/42052
record_format	openpolar
spelling	ftunivhawaiimano:oai:scholarspace.manoa.hawaii.edu:10125/42052 2024-09-15T18:19:02+00:00 Community-based corpus-building: Three case studies Rice, Sally Thunder, Dorothy Rice, Sally Thunder, Dorothy 2017-03-03 application/pdf audio/mpeg http://hdl.handle.net/10125/42052 unknown http://hdl.handle.net/10125/42052 Text Sound 2017 ftunivhawaiimano 2024-08-06T23:39:42Z Text Nakoda ScholarSpace at University of Hawaii at Manoa
institution	Open Polar
collection	ScholarSpace at University of Hawaii at Manoa
op_collection_id	ftunivhawaiimano
language	unknown
description	We describe three ongoing projects involving different First Peoples’ languages of Canada (Cree/nehiyawewin, Dene Sųłiné, and Nakoda/Stoney) that centre around the recording, transcription, compilation, and analysis of spontaneous oral language use (some narrative, some conversation) using freely available, Unicode-savvy corpus software (in this case, AntConc [Anthony 2014]) and little to no up-front annotation or translation into English. Because these languages are all polysynthetic, lemmatization and POS tagging are either unachievable or excessively time-draining and indeterminate activities. Nevertheless, corpus creation can still continue apace and reap huge benefits using the most basic of corpus tools. These projects are consonant with a growing ethos in language documentation circles that advocates for the value of corpus development alongside more traditional documentary activities (cf. McEnery & Ostler 2000, Woodbury 2003, Crowley 2007, Cox 2011, Mosel 2014, Vinogradov 2016). Each corpus is at a different stage of development, yet we hope to persuade community-based colleagues of the enormous benefits that ensue from the deliberate creation and use of a corpus of naturally occurring language data for language analysis and teaching. Direct benefits include ready-to-hand word lists; authentic sample utterances for exemplifying dictionaries, phrasebooks, and grammatical sketches; and a conscientious focus on recording many speakers across different demographic categories, discursive situations, and registers in order to achieve a broad range of usage conditions. A focus on wide and balanced sampling clearly strengthens the data pool from which analyses can follow. But it also results in a closer connection by speakers/learners to important and recurring phenomena in their language rather than to descriptions of phenomena that may have emerged through bilingual situations with a handful of speakers under the direct control of non-speaking linguists (who may have been guided by theoretical concerns ...
format	Text
author	Rice, Sally; Thunder, Dorothy
author_sort	Rice, Sally
title	Community-based corpus-building: Three case studies
publishDate	2017
url	http://hdl.handle.net/10125/42052
genre	Nakoda
op_relation	http://hdl.handle.net/10125/42052