Finite-State Spell-Checking with Weighted Language and Error Models—Building and Evaluating Spell-Checkers with Wikipedia as Corpus
In this paper we present simple methods for construction and evaluation of finite-state spell-checking tools using an existing finite-state lexical automaton, freely available finite-state tools and Internet corpora acquired from projects such as Wikipedia. As an example, we use a freely available o...
Main Authors: | Tommi A Pirinen, Krister Lindén |
---|---|
Other Authors: | The Pennsylvania State University CiteSeerX Archives |
Format: | Text |
Language: | English |
Subjects: | |
Online Access: | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.180.8318 http://www.helsinki.fi/%7Etapirine/publications/Pirinen-lrec-2010.pdf |
id |
ftciteseerx:oai:CiteSeerX.psu:10.1.1.180.8318 |
---|---|
record_format |
openpolar |
spelling |
ftciteseerx:oai:CiteSeerX.psu:10.1.1.180.8318 2023-05-15T17:43:44+02:00 Finite-State Spell-Checking with Weighted Language and Error Models—Building and Evaluating Spell-Checkers with Wikipedia as Corpus Tommi A Pirinen Krister Lindén The Pennsylvania State University CiteSeerX Archives application/pdf http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.180.8318 http://www.helsinki.fi/%7Etapirine/publications/Pirinen-lrec-2010.pdf en eng http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.180.8318 http://www.helsinki.fi/%7Etapirine/publications/Pirinen-lrec-2010.pdf Metadata may be used without restrictions as long as the oai identifier remains attached to it. http://www.helsinki.fi/%7Etapirine/publications/Pirinen-lrec-2010.pdf text ftciteseerx 2016-01-07T16:28:17Z In this paper we present simple methods for construction and evaluation of finite-state spell-checking tools using an existing finite-state lexical automaton, freely available finite-state tools and Internet corpora acquired from projects such as Wikipedia. As an example, we use a freely available open-source implementation of Finnish morphology, made with traditional finite-state morphology tools, and demonstrate rapid building of Northern Sámi and English spell checkers from tools and resources available from the Internet. 1. Text Northern Sámi Sámi Unknown |
institution |
Open Polar |
collection |
Unknown |
op_collection_id |
ftciteseerx |
language |
English |
description |
In this paper we present simple methods for construction and evaluation of finite-state spell-checking tools using an existing finite-state lexical automaton, freely available finite-state tools and Internet corpora acquired from projects such as Wikipedia. As an example, we use a freely available open-source implementation of Finnish morphology, made with traditional finite-state morphology tools, and demonstrate rapid building of Northern Sámi and English spell checkers from tools and resources available from the Internet. |
author2 |
The Pennsylvania State University CiteSeerX Archives |
format |
Text |
author |
Tommi A Pirinen Krister Lindén |
title |
Finite-State Spell-Checking with Weighted Language and Error Models—Building and Evaluating Spell-Checkers with Wikipedia as Corpus |
url |
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.180.8318 http://www.helsinki.fi/%7Etapirine/publications/Pirinen-lrec-2010.pdf |
genre |
Northern Sámi Sámi |
op_source |
http://www.helsinki.fi/%7Etapirine/publications/Pirinen-lrec-2010.pdf |
op_relation |
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.180.8318 http://www.helsinki.fi/%7Etapirine/publications/Pirinen-lrec-2010.pdf |
op_rights |
Metadata may be used without restrictions as long as the oai identifier remains attached to it. |
_version_ |
1766145879269441536 |