The University of Edinburgh's English-Tamil and English-Inuktitut Submissions to the WMT20 News Translation Task

Bibliographic Details
Main Authors: Bawden, Birch, Dobreva, Oncevay, Miceli Barone, Williams
Format: Conference Object
Language: unknown
Published: Zenodo 2020
Subjects:
Online Access:https://doi.org/10.5281/zenodo.6672692
Description
Summary: We describe the University of Edinburgh’s submissions to the WMT20 news translation shared task for the low-resource language pair English-Tamil and the mid-resource language pair English-Inuktitut. We use the neural machine translation transformer architecture for all submissions and explore a variety of techniques to improve translation quality to compensate for the lack of parallel training data. For the very low-resource English-Tamil, this involves exploring pretraining, using both language model objectives and translation from an unrelated high-resource language pair (German-English), and iterative backtranslation. For English-Inuktitut, we explore the use of multilingual systems, which, despite not being part of the primary submission, would have achieved the best results on the test set.
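
The iterative backtranslation mentioned in the summary can be illustrated with a minimal sketch. The outline below is not taken from the paper: the function names (train_translation_model, translate_corpus) and the two-round schedule are hypothetical placeholders for whatever transformer NMT toolkit and training setup the submissions actually use; only the overall alternation between translation directions is what the technique refers to.

```python
# Minimal sketch of iterative backtranslation (illustrative only; the actual
# submissions train transformer NMT models with a full toolkit).
# train_translation_model and translate_corpus are hypothetical stand-ins.

from typing import List, Tuple

ParallelData = List[Tuple[str, str]]  # (source sentence, target sentence)


def train_translation_model(parallel: ParallelData, direction: str):
    """Placeholder: train an NMT model on parallel data for the given direction."""
    return {"direction": direction, "train_size": len(parallel)}  # stub model


def translate_corpus(model, sentences: List[str]) -> List[str]:
    """Placeholder: translate monolingual sentences with the given model."""
    return [f"<translation of: {s}>" for s in sentences]  # stub output


def iterative_backtranslation(
    parallel_en_ta: ParallelData,
    mono_en: List[str],
    mono_ta: List[str],
    rounds: int = 2,
):
    """Alternately backtranslate monolingual data to grow synthetic training sets."""
    synth_for_en_ta: ParallelData = []  # synthetic (en, ta) pairs for the en->ta model
    synth_for_ta_en: ParallelData = []  # synthetic (ta, en) pairs for the ta->en model

    for _ in range(rounds):
        # 1. Train ta->en on genuine + synthetic pairs, then backtranslate Tamil
        #    monolingual text into English.
        ta_en = train_translation_model(
            [(t, s) for s, t in parallel_en_ta] + synth_for_ta_en, "ta-en"
        )
        synth_en = translate_corpus(ta_en, mono_ta)
        # Synthetic English sources paired with genuine Tamil targets train en->ta.
        synth_for_en_ta = list(zip(synth_en, mono_ta))

        # 2. Train en->ta on genuine + synthetic pairs, then backtranslate English
        #    monolingual text into Tamil.
        en_ta = train_translation_model(parallel_en_ta + synth_for_en_ta, "en-ta")
        synth_ta = translate_corpus(en_ta, mono_en)
        # Synthetic Tamil sources paired with genuine English targets train ta->en.
        synth_for_ta_en = list(zip(synth_ta, mono_en))

    return en_ta, ta_en
```

Each round re-trains both directions on the union of genuine and freshly generated synthetic data, which is the core of the iterative scheme; in practice the synthetic data would also be filtered and its mixing ratio with genuine data tuned.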