Translating a low-resource language using GPT-3 and a human-readable dictionary ...
We investigate how well words in the polysynthetic language Inuktitut can be translated by combining dictionary definitions, without use of a neural machine translation model trained on parallel text. Such a translation system would allow natural language technology to benefit from resources designe...
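The approach described in the abstract, combining dictionary definitions of a word's parts and asking a text generation model for a fluent translation, can be pictured with a minimal sketch. Everything here is an assumption for illustration: the function name `build_prompt`, the prompt wording, and the placeholder entries are hypothetical and are not taken from the authors' system, and no model is actually called.

```python
from typing import List, Tuple


def build_prompt(word: str, entries: List[Tuple[str, str]]) -> str:
    """Assemble a plain-text prompt from dictionary definitions of a word's parts.

    `entries` holds (morpheme, definition) pairs in the order they occur in the word.
    The wording of the prompt is a hypothetical choice, not the authors' prompt.
    """
    lines = [f"Word: {word}", "Dictionary definitions of its parts:"]
    for morpheme, definition in entries:
        lines.append(f"- {morpheme}: {definition}")
    lines.append("Combine the definitions into one fluent English translation:")
    return "\n".join(lines)


# Placeholder entries (hypothetical, not real dictionary data).
example_entries = [("<root>", "to hear"), ("<suffix>", "not")]
prompt = build_prompt("<word>", example_entries)
print(prompt)
```

The resulting string would then be sent to a text generation model; which model, which parameters, and how the output is scored are outside this sketch.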
Main Authors: | Association for Computational Linguistics 2023; Elsner, Micha; Needle, Jordan |
---|---|
Format: | Article in Journal/Newspaper |
Language: | unknown |
Published: | Underline Science Inc. 2023 |
Subjects: | Language Models; Natural Language Processing; Natural language generation |
Online Access: | https://dx.doi.org/10.48448/n8my-km42 https://underline.io/lecture/79409-translating-a-low-resource-language-using-gpt-3-and-a-human-readable-dictionary |
id | ftdatacite:10.48448/n8my-km42 |
---|---|
record_format | openpolar |
spelling | ftdatacite:10.48448/n8my-km42 2024-04-28T08:26:34+00:00 Translating a low-resource language using GPT-3 and a human-readable dictionary ... Association for Computational Linguistics 2023 Elsner, Micha Needle, Jordan 2023 https://dx.doi.org/10.48448/n8my-km42 https://underline.io/lecture/79409-translating-a-low-resource-language-using-gpt-3-and-a-human-readable-dictionary unknown Underline Science Inc. Language Models Natural Language Processing Natural language generation article MediaObject Conference talk Audiovisual 2023 ftdatacite https://doi.org/10.48448/n8my-km42 2024-04-02T10:41:40Z We investigate how well words in the polysynthetic language Inuktitut can be translated by combining dictionary definitions, without use of a neural machine translation model trained on parallel text. Such a translation system would allow natural language technology to benefit from resources designed for community use in a language revitalization or education program, rather than requiring a separate parallel corpus. We show that the text-to-text generation capabilities of GPT-3 allow it to perform this task with BLEU scores of up to 18.5. We investigate prompting GPT-3 to provide multiple translations, which can help slightly, and providing it with grammar information, which is mostly ineffective. Finally, we test GPT-3's ability to derive morpheme definitions from whole-word translations, but find this process is prone to errors including hallucinations. ... Article in Journal/Newspaper inuktitut DataCite Metadata Store (German National Library of Science and Technology) |
institution | Open Polar |
collection | DataCite Metadata Store (German National Library of Science and Technology) |
op_collection_id | ftdatacite |
language | unknown |
topic | Language Models Natural Language Processing Natural language generation |
spellingShingle | Language Models Natural Language Processing Natural language generation Association for Computational Linguistics 2023 Elsner, Micha Needle, Jordan Translating a low-resource language using GPT-3 and a human-readable dictionary ... |
topic_facet | Language Models Natural Language Processing Natural language generation |
description | We investigate how well words in the polysynthetic language Inuktitut can be translated by combining dictionary definitions, without use of a neural machine translation model trained on parallel text. Such a translation system would allow natural language technology to benefit from resources designed for community use in a language revitalization or education program, rather than requiring a separate parallel corpus. We show that the text-to-text generation capabilities of GPT-3 allow it to perform this task with BLEU scores of up to 18.5. We investigate prompting GPT-3 to provide multiple translations, which can help slightly, and providing it with grammar information, which is mostly ineffective. Finally, we test GPT-3's ability to derive morpheme definitions from whole-word translations, but find this process is prone to errors including hallucinations. ... |
format | Article in Journal/Newspaper |
author | Association for Computational Linguistics 2023 Elsner, Micha Needle, Jordan |
author_facet | Association for Computational Linguistics 2023 Elsner, Micha Needle, Jordan |
author_sort | Association for Computational Linguistics 2023 |
title | Translating a low-resource language using GPT-3 and a human-readable dictionary ... |
title_short | Translating a low-resource language using GPT-3 and a human-readable dictionary ... |
title_full | Translating a low-resource language using GPT-3 and a human-readable dictionary ... |
title_fullStr | Translating a low-resource language using GPT-3 and a human-readable dictionary ... |
title_full_unstemmed | Translating a low-resource language using GPT-3 and a human-readable dictionary ... |
title_sort | translating a low-resource language using gpt-3 and a human-readable dictionary ... |
publisher | Underline Science Inc. |
publishDate | 2023 |
url | https://dx.doi.org/10.48448/n8my-km42 https://underline.io/lecture/79409-translating-a-low-resource-language-using-gpt-3-and-a-human-readable-dictionary |
genre | inuktitut |
genre_facet | inuktitut |
op_doi | https://doi.org/10.48448/n8my-km42 |
_version_ | 1797585895298170880 |