Effect of Language and Error Models on Efficiency of Finite-State Spell-Checking and Correction
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Text |
Language: | English |
Subjects: | |
Online Access: | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.361.8502 http://www.aclweb.org/anthology/W/W12/W12-6201.pdf |
Summary: | We inspect the viability of finite-state spellchecking and contextless correction of nonword errors in three languages with a large degree of morphological variety. After reviewing previous work, we conduct large-scale tests involving three languages (English, Finnish and Greenlandic) and a variety of error models and algorithms, including proposed improvements of our own. Special reference is made to on-line three-way composition of the input, the error model and the language model. Tests are run on real-world text acquired from freely available sources. We show that the finite-state approaches discussed are sufficiently fast for high-quality correction, even for Greenlandic, which, due to its morphological complexity, is a difficult task for non-finite-state approaches. |