Principles and Practicalities of Corpus Design in Language Retrieval: Issues in the Digitization of the Beynon Corpus of Early Twentieth-Century Sm’algyax Materials
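This paper describes a pilot project to develop a machine-readable corpus of early twentieth-century Sm’algyax texts from a large collection of handwritten manuscripts collected by the Tsimshian ethnographer and chief William Beynon. The project seeks to ensure that the materials produced are maximally accessible to the Tsimshian community. It relates established principles for corpus design to practical issues in language retrieval, recognizing that the corpus will likely function as an intermediate stage between the original manuscripts and any language materials developed by the community. The paper is addressed primarily to linguists working on language retrieval projects but may also be of use to communities who are working with linguists, as it provides insight into the concerns and preoccupations that linguists bring to such tasks.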
Main Authors: | Tonya N. Stebbins, Birgit Hellwig |
---|---|
Format: | Article in Journal/Newspaper |
Language: | English |
Published: | University of Hawai'i Press, 2010 |
Subjects: | corpus, Sm'algyax, William Beynon, Tsimshian |
Online Access: | http://hdl.handle.net/10125/4466 |
id |
ftunivhawaiimano:oai:scholarspace.manoa.hawaii.edu:10125/4466 |
---|---|
record_format |
openpolar |
spelling |
ftunivhawaiimano:oai:scholarspace.manoa.hawaii.edu:10125/4466 2023-05-15T18:39:26+02:00 Principles and Practicalities of Corpus Design in Language Retrieval: Issues in the Digitization of the Beynon Corpus of Early Twentieth-Century Sm’algyax Materials Stebbins, Tonya N. Hellwig, Birgit 2010 26 pages application/pdf http://hdl.handle.net/10125/4466 eng eng University of Hawai'i Press Stebbins, Tonya N. and Birgit Hellwig. 2010. Principles and Practicalities of Corpus Design in Language Retrieval: Issues in the Digitization of the Beynon Corpus of Early Twentieth-Century Sm’algyax Materials. Language Documentation & Conservation 4. 34-59. 1934-5275 http://hdl.handle.net/10125/4466 Creative Commons Attribution Non-Commercial No Derivatives License Attribution Non-Commercial No Derivatives by-nc-nd-nsa CC-BY-NC-ND corpus Sm'algyax William Beynon Tsimshian Article Text 2010 ftunivhawaiimano 2022-07-17T13:28:54Z This paper describes a pilot project to develop a machine-readable corpus of early twentieth-century Sm’algyax texts from a large collection of handwritten manuscripts collected by the Tsimshian ethnographer and chief William Beynon. The project seeks to ensure that the materials produced are maximally accessible to the Tsimshian community. It relates established principles for corpus design to practical issues in language retrieval, recognizing that the corpus will likely function as an intermediate stage between the original manuscripts and any language materials developed by the community. The paper is addressed primarily to linguists working on language retrieval projects but may also be of use to communities who are working with linguists, as it provides insight into the concerns and preoccupations that linguists bring to such tasks. National Foreign Language Resource Center Article in Journal/Newspaper Tsimshian Tsimshian* ScholarSpace at University of Hawaii at Manoa |
institution |
Open Polar |
collection |
ScholarSpace at University of Hawaii at Manoa |
op_collection_id |
ftunivhawaiimano |
language |
English |
topic |
corpus Sm'algyax William Beynon Tsimshian |
description |
This paper describes a pilot project to develop a machine-readable corpus of early twentieth-century Sm’algyax texts from a large collection of handwritten manuscripts collected by the Tsimshian ethnographer and chief William Beynon. The project seeks to ensure that the materials produced are maximally accessible to the Tsimshian community. It relates established principles for corpus design to practical issues in language retrieval, recognizing that the corpus will likely function as an intermediate stage between the original manuscripts and any language materials developed by the community. The paper is addressed primarily to linguists working on language retrieval projects but may also be of use to communities who are working with linguists, as it provides insight into the concerns and preoccupations that linguists bring to such tasks. National Foreign Language Resource Center |
format |
Article in Journal/Newspaper |
author |
Stebbins, Tonya N. Hellwig, Birgit |
author_sort |
Stebbins, Tonya N. |
title |
Principles and Practicalities of Corpus Design in Language Retrieval: Issues in the Digitization of the Beynon Corpus of Early Twentieth-Century Sm’algyax Materials |
publisher |
University of Hawai'i Press |
publishDate |
2010 |
url |
http://hdl.handle.net/10125/4466 |
genre |
Tsimshian Tsimshian* |
op_relation |
Stebbins, Tonya N. and Birgit Hellwig. 2010. Principles and Practicalities of Corpus Design in Language Retrieval: Issues in the Digitization of the Beynon Corpus of Early Twentieth-Century Sm’algyax Materials. Language Documentation & Conservation 4. 34-59. 1934-5275 http://hdl.handle.net/10125/4466 |
op_rights |
Creative Commons Attribution Non-Commercial No Derivatives License Attribution Non-Commercial No Derivatives by-nc-nd-nsa |
op_rightsnorm |
CC-BY-NC-ND |
_version_ |
1766228325268717568 |