NRC systems for the 2020 Inuktitut–English news translation task
Main Authors: | , , , |
---|---|
Format: | Article in Journal/Newspaper |
Language: | English |
Published: | Association for Computational Linguistics, 2020 |
Subjects: | |
Online Access: | https://nrc-publications.canada.ca/eng/view/accepted/?id=e06a1d9c-5574-4ea1-8b93-1ab28090e851 https://nrc-publications.canada.ca/eng/view/object/?id=e06a1d9c-5574-4ea1-8b93-1ab28090e851 https://nrc-publications.canada.ca/fra/voir/objet/?id=e06a1d9c-5574-4ea1-8b93-1ab28090e851 |
Summary: | We describe the National Research Council of Canada (NRC) submissions for the 2020 Inuktitut–English shared task on news translation at the Fifth Conference on Machine Translation (WMT20). Our submissions consist of ensembled domain-specific finetuned transformer models, trained using the Nunavut Hansard and news data and, in the case of Inuktitut–English, backtranslated news and parliamentary data. In this work, we explore challenges related to the relatively small amount of parallel data, morphological complexity, and domain shifts. Peer reviewed: Yes. NRC publication: Yes. |
---|