Summary: | Thesis (M.A.), University of Alaska Fairbanks, 2011. This thesis describes the creation and evaluation of software designed to analyze and generate North Slope Iñupiaq words. Given a complete Iñupiaq word as input, it attempts to identify the word's stem and suffixes, including the grammatical category and any inflectional information contained in the word. Given a stem and a list of suffixes as input, it attempts to produce the corresponding Iñupiaq word, applying phonological processes as necessary. Innovations in the implementation of this software include Iñupiaq-specific formats for specifying lexical data, including a table-based format for specifying inflectional suffixes in paradigms; a treatment of phonologically conditioned irregular allomorphy that leverages the pattern-recognition capabilities of the xfst programming language; and an idiom for composing morphographemic rules in xfst that captures the state of the software each time a new rule is added, maximizing feedback during software compilation and facilitating troubleshooting. In testing, the software recognized 81.2% of all word tokens (78.3% of unique word types) and guessed at the morphology of an additional 16.8% of tokens (19.4% of types). Analyses of recognized words were largely accurate; a heuristic for identifying accurate parses is proposed. Most guesses were at least partly inaccurate. Improvements and applications are proposed. National Science Foundation (Award 0534217)
|