NRC systems for the 2020 Inuktitut–English news translation task

Bibliographic Details
Main Authors: Knowles, Rebecca; Stewart, Darlene; Larkin, Samuel; Littell, Patrick
Format: Article in Journal/Newspaper
Language: English
Published: Association for Computational Linguistics, 2020
Online Access: https://nrc-publications.canada.ca/eng/view/accepted/?id=e06a1d9c-5574-4ea1-8b93-1ab28090e851
https://nrc-publications.canada.ca/eng/view/object/?id=e06a1d9c-5574-4ea1-8b93-1ab28090e851
https://nrc-publications.canada.ca/fra/voir/objet/?id=e06a1d9c-5574-4ea1-8b93-1ab28090e851
Description
Summary: We describe the National Research Council of Canada (NRC) submissions for the 2020 Inuktitut–English shared task on news translation at the Fifth Conference on Machine Translation (WMT20). Our submissions consist of ensembled domain-specific finetuned transformer models, trained using the Nunavut Hansard and news data and, in the case of Inuktitut–English, backtranslated news and parliamentary data. In this work we explore challenges related to the relatively small amount of parallel data, morphological complexity, and domain shifts.
Peer reviewed: Yes
NRC publication: Yes