Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training

Translating to and from low-resource polysynthetic languages presents numerous challenges for NMT. We present the results of our systems for the English--Inuktitut language pair for the WMT 2020 translation tasks. We investigated the importance of correct morphological segmentation, whether or not ad...

Bibliographic Details
Main Authors: Roest, Christian, Edman, Lukas, Minnema, Gosse, Kelly, Kevin, Spenader, Jennifer, Toral, Antonio
Format: Article in Journal/Newspaper
Language: English
Published: Association for Computational Linguistics (ACL) 2020
Online Access: https://hdl.handle.net/11370/ce246963-3b30-4064-ab65-ae9e5e506c5e
https://research.rug.nl/en/publications/ce246963-3b30-4064-ab65-ae9e5e506c5e
https://pure.rug.nl/ws/files/156505029/2020.wmt_1.29.pdf
id ftunigroningenpu:oai:pure.rug.nl:publications/ce246963-3b30-4064-ab65-ae9e5e506c5e
record_format openpolar
spelling ftunigroningenpu:oai:pure.rug.nl:publications/ce246963-3b30-4064-ab65-ae9e5e506c5e 2024-09-15T18:10:15+00:00 Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training Roest, Christian Edman, Lukas Minnema, Gosse Kelly, Kevin Spenader, Jennifer Toral, Antonio 2020-11 application/pdf https://hdl.handle.net/11370/ce246963-3b30-4064-ab65-ae9e5e506c5e https://research.rug.nl/en/publications/ce246963-3b30-4064-ab65-ae9e5e506c5e https://pure.rug.nl/ws/files/156505029/2020.wmt_1.29.pdf eng eng Association for Computational Linguistics (ACL) https://research.rug.nl/en/publications/ce246963-3b30-4064-ab65-ae9e5e506c5e info:eu-repo/semantics/openAccess Roest, C, Edman, L, Minnema, G, Kelly, K, Spenader, J & Toral, A 2020, Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training. in Proceedings of the Fifth Conference on Machine Translation (WMT). Association for Computational Linguistics (ACL), pp. 274-281, Fifth Conference on Machine Translation, 19/11/2020. contributionToPeriodical 2020 ftunigroningenpu 2024-07-01T14:49:23Z Translating to and from low-resource polysynthetic languages presents numerous challenges for NMT. We present the results of our systems for the English--Inuktitut language pair for the WMT 2020 translation tasks. We investigated the importance of correct morphological segmentation, whether or not adding data from a related language (Greenlandic) helps, and whether using contextual word embeddings improves translation. While each method showed some promise, the results are mixed. Article in Journal/Newspaper greenlandic inuktitut University of Groningen research database
institution Open Polar
collection University of Groningen research database
op_collection_id ftunigroningenpu
language English
description Translating to and from low-resource polysynthetic languages presents numerous challenges for NMT. We present the results of our systems for the English--Inuktitut language pair for the WMT 2020 translation tasks. We investigated the importance of correct morphological segmentation, whether or not adding data from a related language (Greenlandic) helps, and whether using contextual word embeddings improves translation. While each method showed some promise, the results are mixed.
format Article in Journal/Newspaper
author Roest, Christian
Edman, Lukas
Minnema, Gosse
Kelly, Kevin
Spenader, Jennifer
Toral, Antonio
spellingShingle Roest, Christian
Edman, Lukas
Minnema, Gosse
Kelly, Kevin
Spenader, Jennifer
Toral, Antonio
Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training
author_facet Roest, Christian
Edman, Lukas
Minnema, Gosse
Kelly, Kevin
Spenader, Jennifer
Toral, Antonio
author_sort Roest, Christian
title Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training
title_short Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training
title_full Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training
title_fullStr Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training
title_full_unstemmed Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training
title_sort machine translation for english--inuktitut with segmentation, data acquisition and pre-training
publisher Association for Computational Linguistics (ACL)
publishDate 2020
url https://hdl.handle.net/11370/ce246963-3b30-4064-ab65-ae9e5e506c5e
https://research.rug.nl/en/publications/ce246963-3b30-4064-ab65-ae9e5e506c5e
https://pure.rug.nl/ws/files/156505029/2020.wmt_1.29.pdf
genre greenlandic
inuktitut
genre_facet greenlandic
inuktitut
op_source Roest, C, Edman, L, Minnema, G, Kelly, K, Spenader, J & Toral, A 2020, Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training. in Proceedings of the Fifth Conference on Machine Translation (WMT). Association for Computational Linguistics (ACL), pp. 274-281, Fifth Conference on Machine Translation, 19/11/2020.
op_relation https://research.rug.nl/en/publications/ce246963-3b30-4064-ab65-ae9e5e506c5e
op_rights info:eu-repo/semantics/openAccess
_version_ 1810447852747358208