Translating a low-resource language using GPT-3 and a human-readable dictionary ...
Main Authors: | , , |
---|---|
Format: | Article in Journal/Newspaper |
Language: | unknown |
Published: | Underline Science Inc., 2023 |
Subjects: | |
Online Access: | https://dx.doi.org/10.48448/n8my-km42 https://underline.io/lecture/79409-translating-a-low-resource-language-using-gpt-3-and-a-human-readable-dictionary |
Summary: | We investigate how well words in the polysynthetic language Inuktitut can be translated by combining dictionary definitions, without use of a neural machine translation model trained on parallel text. Such a translation system would allow natural language technology to benefit from resources designed for community use in a language revitalization or education program, rather than requiring a separate parallel corpus. We show that the text-to-text generation capabilities of GPT-3 allow it to perform this task with BLEU scores of up to 18.5. We investigate prompting GPT-3 to provide multiple translations, which can help slightly, and providing it with grammar information, which is mostly ineffective. Finally, we test GPT-3's ability to derive morpheme definitions from whole-word translations, but find this process is prone to errors including hallucinations. ... |
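
The summary describes translating a word by combining the dictionary definitions of its parts and passing them to a text-to-text model. The record does not give the prompt format, so the following is only a minimal sketch of how such a prompt might be assembled. The function name `build_prompt`, the prompt wording, and the placeholder morphemes are all assumptions for illustration; they are not taken from the paper and are not real Inuktitut.

```python
from typing import List, Tuple


def build_prompt(word: str, glosses: List[Tuple[str, str]]) -> str:
    """Assemble a plain-text prompt from (morpheme, definition) pairs.

    Hypothetical format: the source does not specify the actual prompt.
    """
    lines = [f"Word: {word}", "Dictionary entries for its parts:"]
    for morpheme, definition in glosses:
        # One dictionary definition per line, in morpheme order.
        lines.append(f"- {morpheme}: {definition}")
    lines.append("Combine these definitions into one fluent English translation:")
    return "\n".join(lines)


# Placeholder data only (not real Inuktitut morphemes or definitions).
example = build_prompt(
    "ROOT-SUFFIX",
    [("ROOT", "house"), ("SUFFIX", "to, toward")],
)
print(example)
```

The resulting string would then be sent to a language model of the reader's choice; the model call itself is left out because the record does not describe it.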