Translating a low-resource language using GPT-3 and a human-readable dictionary

Bibliographic Details
Main Authors: Association for Computational Linguistics 2023; Elsner, Micha; Needle, Jordan
Format: Article in Journal/Newspaper
Language: unknown
Published: Underline Science Inc. 2023
Subjects:
Online Access: https://dx.doi.org/10.48448/n8my-km42
https://underline.io/lecture/79409-translating-a-low-resource-language-using-gpt-3-and-a-human-readable-dictionary
Description
Summary: We investigate how well words in the polysynthetic language Inuktitut can be translated by combining dictionary definitions, without use of a neural machine translation model trained on parallel text. Such a translation system would allow natural language technology to benefit from resources designed for community use in a language revitalization or education program, rather than requiring a separate parallel corpus. We show that the text-to-text generation capabilities of GPT-3 allow it to perform this task with BLEU scores of up to 18.5. We investigate prompting GPT-3 to provide multiple translations, which can help slightly, and providing it with grammar information, which is mostly ineffective. Finally, we test GPT-3's ability to derive morpheme definitions from whole-word translations, but find this process is prone to errors including hallucinations. ...