You can’t suggest that?!

In this article, we study the correction of spelling errors, specifically how spelling errors are made and how we can model them computationally in order to fix them. The article describes two different approaches to generating spelling correction suggestions for three Uralic languages: Estonian,...

Bibliographic Details
Published in:	Nordlyd
Main Authors:	Heiki-Jaan Kaalep, Flammie Pirinen, Sjur Moshagen
Format:	Article in Journal/Newspaper
Language:	English Norwegian
Published:	Septentrio Academic Publishing 2022
Subjects:	Spell-Checking; rule-based; fsa; machine learning; sami languages; estonian; Language. Linguistic theory. Comparative grammar; P101-410; North Sámi; sami; Sámi; South Sámi
Online Access:	https://doi.org/10.7557/12.6349 https://doaj.org/article/66fab39dfd704bbc974aa4e1b585c8fa

id	ftdoajarticles:oai:doaj.org/article:66fab39dfd704bbc974aa4e1b585c8fa
record_format	openpolar
institution	Open Polar
collection	Directory of Open Access Journals: DOAJ Articles
op_collection_id	ftdoajarticles
language	English Norwegian
topic	Spell-Checking rule-based fsa machine learning sami languages estonian Language. Linguistic theory. Comparative grammar P101-410
topic_facet	Spell-Checking rule-based fsa machine learning sami languages estonian Language. Linguistic theory. Comparative grammar P101-410
description	In this article, we study the correction of spelling errors, specifically how spelling errors are made and how we can model them computationally in order to fix them. The article describes two different approaches to generating spelling correction suggestions for three Uralic languages: Estonian, North Sámi and South Sámi. The first approach to modelling spelling errors is rule-based: experts write rules that describe the kinds of errors that are made, and these are compiled into a finite-state automaton that models the errors. The second is data-based: we show a machine learning algorithm a corpus of errors that humans have made, and it creates a neural network that can model the errors. Both approaches require collecting error corpora and understanding their contents; therefore we also describe in detail the actual errors we have seen. We find that while both approaches create error correction systems, with current resources the expert-built systems are still more reliable.
format	Article in Journal/Newspaper
author	Heiki-Jaan Kaalep Flammie Pirinen Sjur Moshagen
author_facet	Heiki-Jaan Kaalep Flammie Pirinen Sjur Moshagen
author_sort	Heiki-Jaan Kaalep
title	You can’t suggest that?!
title_sort	you can’t suggest that?!
publisher	Septentrio Academic Publishing
publishDate	2022
url	https://doi.org/10.7557/12.6349 https://doaj.org/article/66fab39dfd704bbc974aa4e1b585c8fa
genre	North Sámi sami Sámi South Sámi
genre_facet	North Sámi sami Sámi South Sámi
op_source	Nordlyd: Tromsø University Working Papers on Language & Linguistics, Vol 46, Iss 1 (2022)
op_relation	https://septentrio.uit.no/index.php/nordlyd/article/view/6349 https://doaj.org/toc/1503-8599 doi:10.7557/12.6349 1503-8599 https://doaj.org/article/66fab39dfd704bbc974aa4e1b585c8fa
op_doi	https://doi.org/10.7557/12.6349
container_title	Nordlyd
container_volume	46
container_issue	1
_version_	1766140933153226752
