Using Finite State Transducers for Making Efficient Reading Comprehension Dictionaries
This article presents a novel way of combining finite-state transducers (FSTs) with electronic dictionaries, thereby creating efficient reading comprehension dictionaries. We compare a North Saami-Norwegian and a South Saami-Norwegian dictionary, both enriched with an FST, with existing, available...
Main Authors: | Ryan Johnson, Lene Antonsen, Trond Trosterud |
---|---|
Other Authors: | The Pennsylvania State University CiteSeerX Archives |
Format: | Text |
Language: | English |
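Subjects: | Lexicography, Computational Morphology, Orthographic Variation, Finite-state Transducers, Electronic Dictionaries |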
Online Access: | http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.658.4282 http://emmtee.net/oe/nodalida13/conference/45.pdf |
id |
ftciteseerx:oai:CiteSeerX.psu:10.1.1.658.4282 |
---|---|
record_format |
openpolar |
institution |
Open Polar |
collection |
Unknown |
op_collection_id |
ftciteseerx |
language |
English |
topic |
Lexicography Computational Morphology Orthographic Variation Finite-state Transducers Electronic Dictionaries |
topic_facet |
Lexicography Computational Morphology Orthographic Variation Finite-state Transducers Electronic Dictionaries |
description |
This article presents a novel way of combining finite-state transducers (FSTs) with electronic dictionaries, thereby creating efficient reading comprehension dictionaries. We compare a North Saami-Norwegian and a South Saami-Norwegian dictionary, both enriched with an FST, with existing, available dictionaries containing pre-generated paradigms, and show the advantages of our approach. Being more flexible, the FSTs can also adapt the dictionary to different contexts. The finite-state transducer analyses the word to be looked up, and the dictionary itself conducts the actual lookup. The FST component is crucial for morphologically rich languages, where as little as 10% of the wordforms in running text are lemma forms. If a compound or derived word, or a word with an enclitic particle, is not found in the dictionary, the FST returns the stems and derivational affixes of the wordform, and each stem is given a separate translation. In this way, the coverage of the FST dictionary is far larger than that of an ordinary dictionary of the same size. |
author2 |
The Pennsylvania State University CiteSeerX Archives |
format |
Text |
author |
Ryan Johnson, Lene Antonsen, Trond Trosterud |
author_facet |
Ryan Johnson, Lene Antonsen, Trond Trosterud |
author_sort |
Ryan Johnson |
title |
Using Finite State Transducers for Making Efficient Reading Comprehension Dictionaries |
title_sort |
using finite state transducers for making efficient reading comprehension dictionaries |
url |
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.658.4282 http://emmtee.net/oe/nodalida13/conference/45.pdf |
long_lat |
ENVELOPE(19.530,19.530,69.873,69.873) |
geographic |
Lemma |
geographic_facet |
Lemma |
genre |
saami |
genre_facet |
saami |
op_source |
http://emmtee.net/oe/nodalida13/conference/45.pdf |
op_relation |
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.658.4282 http://emmtee.net/oe/nodalida13/conference/45.pdf |
op_rights |
Metadata may be used without restrictions as long as the oai identifier remains attached to it. |
_version_ |
1766180502862036992 |
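
The lookup flow summarised in the description (the FST analyses the wordform, the dictionary looks up the resulting lemma, and unlisted compounds fall back to per-stem translations) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `TOY_ANALYSER` is a hand-written table standing in for a real FST, the English-Norwegian entries are toy data rather than material from the article, and the names `Analysis`, `analyse`, and `look_up` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Analysis:
    """One reading of a wordform, as a morphological analyser might return it."""
    lemma: str               # dictionary form of the whole word
    stems: Tuple[str, ...]   # component stems (one stem for simple words)
    tags: Tuple[str, ...]    # morphological tags, e.g. ("N", "Pl")


# Hypothetical stand-in for the FST: maps an inflected wordform to its readings.
TOY_ANALYSER: Dict[str, List[Analysis]] = {
    "dogs": [Analysis("dog", ("dog",), ("N", "Pl"))],
    "bookshelves": [Analysis("bookshelf", ("book", "shelf"), ("N", "Pl"))],
}

# Hypothetical bilingual dictionary keyed by lemma (toy data).
TOY_DICTIONARY: Dict[str, str] = {
    "dog": "hund",
    "book": "bok",
    "shelf": "hylle",
}


def analyse(wordform: str) -> List[Analysis]:
    """Return all readings of a wordform; empty if the analyser does not know it."""
    return TOY_ANALYSER.get(wordform.lower(), [])


def look_up(wordform: str) -> List[Tuple[str, str]]:
    """Analyse the wordform, then let the dictionary do the actual lookup.

    If the lemma of a reading has no entry of its own, fall back to
    translating each of its stems separately.
    """
    results: List[Tuple[str, str]] = []
    for reading in analyse(wordform):
        if reading.lemma in TOY_DICTIONARY:
            results.append((reading.lemma, TOY_DICTIONARY[reading.lemma]))
            continue
        for stem in reading.stems:
            if stem in TOY_DICTIONARY:
                results.append((stem, TOY_DICTIONARY[stem]))
    return results


if __name__ == "__main__":
    print(look_up("dogs"))         # inflected form resolved through its lemma
    print(look_up("bookshelves"))  # compound without an entry: per-stem fallback
```

In this sketch the dictionary has three entries yet answers queries for wordforms it never lists, which is the coverage argument the description makes; a real system would replace the lookup table in `analyse` with calls to an actual transducer.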