Finite-State Spell-Checking with Weighted Language and Error Models : Building and Evaluating Spell-Checkers with Wikipedia as Corpus
In this paper we present simple methods for construction and evaluation of finite-state spell-checking tools using an existing finite-state lexical automaton, freely available finite-state tools and Internet corpora acquired from projects such as Wikipedia. As an example, we use a freely available o...
Main Authors: | Pirinen, Tommi; Linden, Krister |
---|---|
Other Authors: | Department of Modern Languages 2010-2017; Krister Linden / Research Group |
Format: | Conference Object |
Language: | English |
Published: | 2012 |
Subjects: | 612 Languages and Literature |
Online Access: | http://hdl.handle.net/10138/29358 |
id |
ftunivhelsihelda:oai:helda.helsinki.fi:10138/29358 |
---|---|
record_format |
openpolar |
institution |
Open Polar |
collection |
HELDA – University of Helsinki Open Repository |
op_collection_id |
ftunivhelsihelda |
language |
English |
topic |
612 Languages and Literature |
description |
In this paper we present simple methods for construction and evaluation of finite-state spell-checking tools using an existing finite-state lexical automaton, freely available finite-state tools and Internet corpora acquired from projects such as Wikipedia. As an example, we use a freely available open-source implementation of Finnish morphology, made with traditional finite-state morphology tools, and demonstrate rapid building of Northern Sámi and English spell checkers from tools and resources available from the Internet. (Peer reviewed.) |
author2 |
Department of Modern Languages 2010-2017; Krister Linden / Research Group |
format |
Conference Object |
author |
Pirinen, Tommi; Linden, Krister |
title |
Finite-State Spell-Checking with Weighted Language and Error Models : Building and Evaluating Spell-Checkers with Wikipedia as Corpus |
publishDate |
2012 |
url |
http://hdl.handle.net/10138/29358 |
genre |
Northern Sámi; Sámi |
op_relation |
Proceedings of LREC 2010; 2-9517408-6-7; Pirinen, T. & Linden, K. 2010, Finite-State Spell-Checking with Weighted Language and Error Models : Building and Evaluating Spell-Checkers with Wikipedia as Corpus. In Proceedings of LREC 2010: Workshop on Creation and use of basic lexical resources for less-resourced languages. LREC 2010, Malta, Malta, 17/05/2010. conference; ORCID: /0000-0003-2337-303X/work/29934361; 22c68334-4640-4c5a-b21c-92beb472811a; http://hdl.handle.net/10138/29358 |
op_rights |
restrictedAccess; info:eu-repo/semantics/restrictedAccess |
_version_ |
1787427067222032384 |