Finite-State Spell-Checking with Weighted Language and Error Models : Building and Evaluating Spell-Checkers with Wikipedia as Corpus

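In this paper we present simple methods for construction and evaluation of finite-state spell-checking tools using an existing finite-state lexical automaton, freely available finite-state tools and Internet corpora acquired from projects such as Wikipedia. As an example, we use a freely available open-source implementation of Finnish morphology, made with traditional finite-state morphology tools, and demonstrate rapid building of Northern Sámi and English spell checkers from tools and resources available from the Internet.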

Bibliographic Details
Main Authors: Pirinen, Tommi; Linden, Krister
Other Authors: Department of Modern Languages 2010-2017, Krister Linden / Research Group
Format: Conference Object
Language: English
Published: 2012
Subjects: 612 Languages and Literature
Online Access: http://hdl.handle.net/10138/29358
id ftunivhelsihelda:oai:helda.helsinki.fi:10138/29358
record_format openpolar
institution Open Polar
collection HELDA – University of Helsinki Open Repository
op_collection_id ftunivhelsihelda
language English
topic 612 Languages and Literature
description In this paper we present simple methods for construction and evaluation of finite-state spell-checking tools using an existing finite-state lexical automaton, freely available finite-state tools and Internet corpora acquired from projects such as Wikipedia. As an example, we use a freely available open-source implementation of Finnish morphology, made with traditional finite-state morphology tools, and demonstrate rapid building of Northern Sámi and English spell checkers from tools and resources available from the Internet. Peer reviewed
author2 Department of Modern Languages 2010-2017
Krister Linden / Research Group
format Conference Object
author Pirinen, Tommi
Linden, Krister
title Finite-State Spell-Checking with Weighted Language and Error Models : Building and Evaluating Spell-Checkers with Wikipedia as Corpus
publishDate 2012
url http://hdl.handle.net/10138/29358
genre Northern Sámi
Sámi
op_relation Proceedings of LREC 2010
2-9517408-6-7
Pirinen, T & Linden, K 2010, Finite-State Spell-Checking with Weighted Language and Error Models : Building and Evaluating Spell-Checkers with Wikipedia as Corpus. in Proceedings of LREC 2010 : Workshop on Creation and use of basic lexical resources for less-resourced languages. LREC 2010, Malta, Malta, 17/05/2010.
conference
ORCID: /0000-0003-2337-303X/work/29934361
22c68334-4640-4c5a-b21c-92beb472811a
http://hdl.handle.net/10138/29358
op_rights restrictedAccess
info:eu-repo/semantics/restrictedAccess