Effect of Language and Error Models on Efficiency of Finite-State Spell-Checking and Correction

We inspect the viability of finite-state spellchecking and contextless correction of nonword errors in three languages with a large degree of morphological variety. Overviewing previous work, we conduct large-scale tests involving three languages — English, Finnish and Greenlandic — and a variety of...

Bibliographic Details
Main Authors: Tommi A Pirinen, Sam Hardwick
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language: English
Subjects:
Online Access: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.361.8502
http://www.aclweb.org/anthology/W/W12/W12-6201.pdf
id ftciteseerx:oai:CiteSeerX.psu:10.1.1.361.8502
record_format openpolar
institution Open Polar
collection Unknown
op_collection_id ftciteseerx
language English
description We inspect the viability of finite-state spell-checking and contextless correction of nonword errors in three languages with a large degree of morphological variety. After reviewing previous work, we conduct large-scale tests involving three languages (English, Finnish and Greenlandic) and a variety of error models and algorithms, including proposed improvements of our own. Special reference is made to on-line three-way composition of the input, the error model and the language model. Tests are run on real-world text acquired from freely available sources. We show that the finite-state approaches discussed are sufficiently fast for high-quality correction, even for Greenlandic, which, due to its morphological complexity, is a difficult task for non-finite-state approaches.
author2 The Pennsylvania State University CiteSeerX Archives
format Text
author Tommi A Pirinen
Sam Hardwick
spellingShingle Tommi A Pirinen
Sam Hardwick
Effect of Language and Error Models on Efficiency of Finite-State Spell-Checking and Correction
author_facet Tommi A Pirinen
Sam Hardwick
author_sort Tommi A Pirinen
title Effect of Language and Error Models on Efficiency of Finite-State Spell-Checking and Correction
url http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.361.8502
http://www.aclweb.org/anthology/W/W12/W12-6201.pdf
genre greenlandic
genre_facet greenlandic
op_source http://www.aclweb.org/anthology/W/W12/W12-6201.pdf
op_relation http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.361.8502
http://www.aclweb.org/anthology/W/W12/W12-6201.pdf
op_rights Metadata may be used without restrictions as long as the oai identifier remains attached to it.