Modeling Saint Lawrence Island Yupik morphology to support revitalization
Main Author: | |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2023 |
Online Access: | https://hdl.handle.net/2142/120250 |
Summary: | This thesis explores two approaches to modeling the morphology of Saint Lawrence Island Yupik, the native language of the Indigenous people of Saint Lawrence Island, Alaska. Since the beginning of the new millennium, Saint Lawrence Island Yupik, hereafter Yupik, has been waning in usage among its Native population, particularly among younger generations. Unless steps are taken to reverse these trends, the language, already classified as "Shifting" on the EGIDS scale, will likely become dormant within the next century. Exacerbating this is the relative lack of documentation on Yupik, as compared to some other languages in the Inuit-Yupik-Unangam Tunuu language family. Linguistic research has been intermittent over the past few decades, and the few Yupik-language texts that do exist were digitized only recently (within the past six years). From a computational perspective, then, Saint Lawrence Island Yupik is considered low-resource. Despite these challenges, the Yupik community has expressed a desire for revitalization, and the research described in this thesis seeks to support that effort. The contributions of this work are thus twofold: one documentary and the other computational. In particular, this thesis extends and updates existing documentation of the approximately 500 derivational morphemes listed in the Badten et al. (2008) dictionary. Each morpheme was assessed for productivity, and example sentences demonstrating its usage were elicited from speakers. The sentences form a dataset that not only illustrates Yupik as it is spoken in the present day, but also serves as an evaluation set for the morphological models implemented later. The remainder of the thesis explores two computational approaches to modeling Yupik morphology. The first is a rule-based approach involving finite-state transducers, and two such models were implemented (Chen and Schwartz, 2018; Chen et al., 2020). They differed structurally and consequently yielded differing results in performance. The second is a ... |