Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training
Translating to and from low-resource polysynthetic languages presents numerous challenges for NMT. We present the results of our systems for the English--Inuktitut language pair for the WMT 2020 translation tasks. We investigated the importance of correct morphological segmentation, whether or not ad...
Main Authors: | Roest, Christian; Edman, Lukas; Minnema, Gosse; Kelly, Kevin; Spenader, Jennifer; Toral, Antonio |
---|---|
Format: | Article in Journal/Newspaper |
Language: | English |
Published: | Association for Computational Linguistics (ACL) 2020 |
Subjects: | |
Online Access: | https://hdl.handle.net/11370/ce246963-3b30-4064-ab65-ae9e5e506c5e https://research.rug.nl/en/publications/ce246963-3b30-4064-ab65-ae9e5e506c5e https://pure.rug.nl/ws/files/156505029/2020.wmt_1.29.pdf |
id |
ftunigroningenpu:oai:pure.rug.nl:publications/ce246963-3b30-4064-ab65-ae9e5e506c5e |
---|---|
record_format |
openpolar |
spelling |
ftunigroningenpu:oai:pure.rug.nl:publications/ce246963-3b30-4064-ab65-ae9e5e506c5e 2024-09-15T18:10:15+00:00 Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training Roest, Christian Edman, Lukas Minnema, Gosse Kelly, Kevin Spenader, Jennifer Toral, Antonio 2020-11 application/pdf https://hdl.handle.net/11370/ce246963-3b30-4064-ab65-ae9e5e506c5e https://research.rug.nl/en/publications/ce246963-3b30-4064-ab65-ae9e5e506c5e https://pure.rug.nl/ws/files/156505029/2020.wmt_1.29.pdf eng eng Association for Computational Linguistics (ACL) https://research.rug.nl/en/publications/ce246963-3b30-4064-ab65-ae9e5e506c5e info:eu-repo/semantics/openAccess Roest, C, Edman, L, Minnema, G, Kelly, K, Spenader, J & Toral, A 2020, Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training. in Proceedings of the Fifth Conference on Machine Translation (WMT). Association for Computational Linguistics (ACL), pp. 274-281, Fifth Conference on Machine Translation, 19/11/2020. contributionToPeriodical 2020 ftunigroningenpu 2024-07-01T14:49:23Z Translating to and from low-resource polysynthetic languages presents numerous challenges for NMT. We present the results of our systems for the English--Inuktitut language pair for the WMT 2020 translation tasks. We investigated the importance of correct morphological segmentation, whether or not adding data from a related language (Greenlandic) helps, and whether using contextual word embeddings improves translation. While each method showed some promise, the results are mixed. Article in Journal/Newspaper greenlandic inuktitut University of Groningen research database |
institution |
Open Polar |
collection |
University of Groningen research database |
op_collection_id |
ftunigroningenpu |
language |
English |
description |
Translating to and from low-resource polysynthetic languages presents numerous challenges for NMT. We present the results of our systems for the English--Inuktitut language pair for the WMT 2020 translation tasks. We investigated the importance of correct morphological segmentation, whether or not adding data from a related language (Greenlandic) helps, and whether using contextual word embeddings improves translation. While each method showed some promise, the results are mixed. |
format |
Article in Journal/Newspaper |
author |
Roest, Christian Edman, Lukas Minnema, Gosse Kelly, Kevin Spenader, Jennifer Toral, Antonio |
spellingShingle |
Roest, Christian Edman, Lukas Minnema, Gosse Kelly, Kevin Spenader, Jennifer Toral, Antonio Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training |
author_facet |
Roest, Christian Edman, Lukas Minnema, Gosse Kelly, Kevin Spenader, Jennifer Toral, Antonio |
author_sort |
Roest, Christian |
title |
Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training |
title_short |
Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training |
title_full |
Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training |
title_fullStr |
Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training |
title_full_unstemmed |
Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training |
title_sort |
machine translation for english--inuktitut with segmentation, data acquisition and pre-training |
publisher |
Association for Computational Linguistics (ACL) |
publishDate |
2020 |
url |
https://hdl.handle.net/11370/ce246963-3b30-4064-ab65-ae9e5e506c5e https://research.rug.nl/en/publications/ce246963-3b30-4064-ab65-ae9e5e506c5e https://pure.rug.nl/ws/files/156505029/2020.wmt_1.29.pdf |
genre |
greenlandic inuktitut |
genre_facet |
greenlandic inuktitut |
op_source |
Roest, C, Edman, L, Minnema, G, Kelly, K, Spenader, J & Toral, A 2020, Machine Translation for English--Inuktitut with Segmentation, Data Acquisition and Pre-Training. in Proceedings of the Fifth Conference on Machine Translation (WMT). Association for Computational Linguistics (ACL), pp. 274-281, Fifth Conference on Machine Translation, 19/11/2020. |
op_relation |
https://research.rug.nl/en/publications/ce246963-3b30-4064-ab65-ae9e5e506c5e |
op_rights |
info:eu-repo/semantics/openAccess |
_version_ |
1810447852747358208 |