Translating a low-resource language using GPT-3 and a human-readable dictionary ...

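We investigate how well words in the polysynthetic language Inuktitut can be translated by combining dictionary definitions, without use of a neural machine translation model trained on parallel text. Such a translation system would allow natural language technology to benefit from resources designed for community use in a language revitalization or education program, rather than requiring a separate parallel corpus. We show that the text-to-text generation capabilities of GPT-3 allow it to perform this task with BLEU scores of up to 18.5. We investigate prompting GPT-3 to provide multiple translations, which can help slightly, and providing it with grammar information, which is mostly ineffective. Finally, we test GPT-3's ability to derive morpheme definitions from whole-word translations, but find this process is prone to errors including hallucinations. ...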

Bibliographic Details
Main Authors: Association for Computational Linguistics 2023; Elsner, Micha; Needle, Jordan
Format: Article in Journal/Newspaper
Language: unknown
Published: Underline Science Inc. 2023
Subjects: Language Models; Natural Language Processing; Natural language generation
Online Access: https://dx.doi.org/10.48448/n8my-km42
https://underline.io/lecture/79409-translating-a-low-resource-language-using-gpt-3-and-a-human-readable-dictionary
id ftdatacite:10.48448/n8my-km42
record_format openpolar
spelling ftdatacite:10.48448/n8my-km42 2024-04-28T08:26:34+00:00 article MediaObject Conference talk Audiovisual 2024-04-02T10:41:40Z
institution Open Polar
collection DataCite Metadata Store (German National Library of Science and Technology)
op_collection_id ftdatacite
language unknown
topic Language Models
Natural Language Processing
Natural language generation
description We investigate how well words in the polysynthetic language Inuktitut can be translated by combining dictionary definitions, without use of a neural machine translation model trained on parallel text. Such a translation system would allow natural language technology to benefit from resources designed for community use in a language revitalization or education program, rather than requiring a separate parallel corpus. We show that the text-to-text generation capabilities of GPT-3 allow it to perform this task with BLEU scores of up to 18.5. We investigate prompting GPT-3 to provide multiple translations, which can help slightly, and providing it with grammar information, which is mostly ineffective. Finally, we test GPT-3's ability to derive morpheme definitions from whole-word translations, but find this process is prone to errors including hallucinations. ...
format Article in Journal/Newspaper
author Association for Computational Linguistics 2023
Elsner, Micha
Needle, Jordan
title Translating a low-resource language using GPT-3 and a human-readable dictionary ...
publisher Underline Science Inc.
publishDate 2023
url https://dx.doi.org/10.48448/n8my-km42
https://underline.io/lecture/79409-translating-a-low-resource-language-using-gpt-3-and-a-human-readable-dictionary
genre inuktitut
op_doi https://doi.org/10.48448/n8my-km42
_version_ 1797585895298170880