Unsupervised Learning of Morphology for English and Inuktitut

Bibliographic Details
Main Authors: Howard Johnson, Joel Martin
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language: English
Published: 2003
Online Access: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.4.8011
http://acl.ldc.upenn.edu/N/N03/N03-2015.pdf
Description
Summary: We describe a simple unsupervised technique for learning morphology by identifying hubs in an automaton. For our purposes, a hub is a node in a graph with in-degree greater than one and out-degree greater than one. We create a word-trie, transform it into a minimal DFA, then identify hubs. Those hubs mark the boundary between root and suffix, achieving similar performance to more complex mixtures of techniques.
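
The pipeline in the summary (word-trie, minimal DFA, hubs) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (build_trie, minimize, find_hubs, split_word) and the toy word list are hypothetical. The sketch also assumes that out-degree counts only labelled transitions (acceptance is not counted as an edge) and that a word is split at the first hub it passes through; the record does not specify either choice.

```python
from collections import defaultdict


def build_trie(words):
    """Build a word-trie. children[s] maps a character to a child state id."""
    children = [{}]
    final = [False]
    for w in words:
        s = 0
        for ch in w:
            if ch not in children[s]:
                children.append({})
                final.append(False)
                children[s][ch] = len(children) - 1
            s = children[s][ch]
        final[s] = True
    return children, final


def minimize(children, final):
    """Merge trie states that accept the same set of suffixes.

    For an acyclic automaton this bottom-up merge yields the minimal DFA.
    Recursion depth equals the longest word length.
    """
    registry = {}  # signature -> state id in the minimized DFA
    edges = []     # edges[m] maps a character to a minimized state id
    accept = []

    def canon(s):
        sig = (final[s],
               tuple(sorted((ch, canon(t)) for ch, t in children[s].items())))
        if sig not in registry:
            registry[sig] = len(edges)
            edges.append(dict(sig[1]))
            accept.append(final[s])
        return registry[sig]

    start = canon(0)
    return start, edges, accept


def find_hubs(edges):
    """Return states with in-degree > 1 and out-degree > 1.

    Assumption: degrees count labelled transitions only.
    """
    indeg = defaultdict(int)
    for out in edges:
        for target in out.values():
            indeg[target] += 1
    return {s for s, out in enumerate(edges) if indeg[s] > 1 and len(out) > 1}


def split_word(word, start, edges, hubs):
    """Split a known word at the first hub on its path (an assumption)."""
    s = start
    for i, ch in enumerate(word):
        s = edges[s][ch]
        if s in hubs and i + 1 < len(word):
            return word[:i + 1], word[i + 1:]
    return word, ""


# Toy vocabulary, invented for illustration only.
words = ["walk", "walks", "walked", "jump", "jumps", "jumped"]
start, edges, accept = minimize(*build_trie(words))
hubs = find_hubs(edges)
print(split_word("walked", start, edges, hubs))  # ('walk', 'ed')
print(split_word("jumps", start, edges, hubs))   # ('jump', 's')
```

In this toy vocabulary the states reached after "walk" and "jump" accept the same suffixes, so minimization merges them into one state with two incoming and two outgoing transitions. That merged state is the only hub, and it sits at the root/suffix boundary. Note that roots sharing a prefix path as well as a suffix set (for example "walk" and "talk" alone) would merge along the whole path and leave the boundary state with in-degree one, so the sketch needs varied roots to find a hub.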