Finite-State Spell-Checking with Weighted Language and Error Models—Building and Evaluating Spell-Checkers with Wikipedia as Corpus

In this paper we present simple methods for construction and evaluation of finite-state spell-checking tools using an existing finite-state lexical automaton, freely available finite-state tools and Internet corpora acquired from projects such as Wikipedia. As an example, we use a freely available o...

Bibliographic Details
Main Authors: Tommi A Pirinen, Krister Lindén
Other Authors: The Pennsylvania State University CiteSeerX Archives
Format: Text
Language: English
Subjects:
Online Access: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.180.8318
http://www.helsinki.fi/%7Etapirine/publications/Pirinen-lrec-2010.pdf
id ftciteseerx:oai:CiteSeerX.psu:10.1.1.180.8318
record_format openpolar
spelling ftciteseerx:oai:CiteSeerX.psu:10.1.1.180.8318 2023-05-15T17:43:44+02:00 Finite-State Spell-Checking with Weighted Language and Error Models—Building and Evaluating Spell-Checkers with Wikipedia as Corpus Tommi A Pirinen Krister Lindén The Pennsylvania State University CiteSeerX Archives application/pdf http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.180.8318 http://www.helsinki.fi/%7Etapirine/publications/Pirinen-lrec-2010.pdf en eng http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.180.8318 http://www.helsinki.fi/%7Etapirine/publications/Pirinen-lrec-2010.pdf Metadata may be used without restrictions as long as the oai identifier remains attached to it. http://www.helsinki.fi/%7Etapirine/publications/Pirinen-lrec-2010.pdf text ftciteseerx 2016-01-07T16:28:17Z In this paper we present simple methods for construction and evaluation of finite-state spell-checking tools using an existing finite-state lexical automaton, freely available finite-state tools and Internet corpora acquired from projects such as Wikipedia. As an example, we use a freely available open-source implementation of Finnish morphology, made with traditional finite-state morphology tools, and demonstrate rapid building of Northern Sámi and English spell checkers from tools and resources available from the Internet. 1. Text Northern Sámi Sámi Unknown
institution Open Polar
collection Unknown
op_collection_id ftciteseerx
language English
description In this paper we present simple methods for construction and evaluation of finite-state spell-checking tools using an existing finite-state lexical automaton, freely available finite-state tools and Internet corpora acquired from projects such as Wikipedia. As an example, we use a freely available open-source implementation of Finnish morphology, made with traditional finite-state morphology tools, and demonstrate rapid building of Northern Sámi and English spell checkers from tools and resources available from the Internet.
author2 The Pennsylvania State University CiteSeerX Archives
format Text
author Tommi A Pirinen
Krister Lindén
title Finite-State Spell-Checking with Weighted Language and Error Models—Building and Evaluating Spell-Checkers with Wikipedia as Corpus
url http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.180.8318
http://www.helsinki.fi/%7Etapirine/publications/Pirinen-lrec-2010.pdf
genre Northern Sámi
Sámi
op_source http://www.helsinki.fi/%7Etapirine/publications/Pirinen-lrec-2010.pdf
op_relation http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.180.8318
http://www.helsinki.fi/%7Etapirine/publications/Pirinen-lrec-2010.pdf
op_rights Metadata may be used without restrictions as long as the oai identifier remains attached to it.
_version_ 1766145879269441536