Models for Inuktitut-English Word Alignment


Bibliographic Details
Main Authors: Charles Schafer, Elliott Franco Drabek
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language: English
Published: 2005
Online Access: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.59.8482
http://acl.ldc.upenn.edu/W/W05/W05-0811.pdf
Description
Summary: This paper presents a set of techniques for bitext word alignment, optimized for a language pair with the characteristics of Inuktitut-English. The resulting systems exploit cross-lingual affinities at the sublexical level of syllables and substrings, as well as regular patterns of transliteration and the tendency towards monotonicity of alignment. Our most successful systems were based on classifier combination, and we found that different combination methods performed best under each of the two target evaluation metrics, F-measure and alignment error rate.
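
For readers unfamiliar with the two metrics named in the summary, the sketch below computes them using the definitions commonly used in word-alignment evaluation, with "sure" and "possible" gold links. It is an illustration only: the function name `alignment_scores`, the link representation as (source index, target index) pairs, and the toy data are assumptions, and the record does not confirm that the paper scored its systems in exactly this way.

```python
def alignment_scores(predicted, sure, possible=None):
    """Return (precision, recall, f_measure, aer) for one set of alignment links.

    Each link is a (source_index, target_index) pair. This is a hypothetical
    helper for illustration, not code from the paper.
    """
    a = set(predicted)
    s = set(sure)
    # By convention every sure link also counts as a possible link.
    p = (set(possible) | s) if possible is not None else s
    if not a or not s:
        raise ValueError("predicted and sure link sets must be non-empty")

    precision = len(a & p) / len(a)   # predicted links that are at least possible
    recall = len(a & s) / len(s)      # sure links that were recovered
    total = precision + recall
    f_measure = 2 * precision * recall / total if total > 0 else 0.0
    # Alignment error rate: 1 - (|A & S| + |A & P|) / (|A| + |S|)
    aer = 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))
    return precision, recall, f_measure, aer


# Toy example (made-up links, not data from the paper).
predicted = {(0, 0), (1, 2), (2, 3)}
sure = {(0, 0), (1, 1)}
possible = {(2, 2), (2, 3)}
precision, recall, f_measure, aer = alignment_scores(predicted, sure, possible)
```

Because alignment error rate rewards possible links in its precision-like term while F-measure balances precision against recall on sure links, a system tuned for one metric need not be the best under the other, which is consistent with the summary's observation about combination methods.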