Low-Resource Machine Translation through the Lens of Personalized Federated Learning ...
We present a new approach based on the Personalized Federated Learning algorithm MeritFed that can be applied to Natural Language Tasks with heterogeneous data. We evaluate it on the Low-Resource Machine Translation task, using the dataset from the Large-Scale Multilingual Machine Translation Shared...
Main Authors: | Moskvoretskii, Viktor; Tupitsa, Nazarii; Biemann, Chris; Horváth, Samuel; Gorbunov, Eduard; Nikishina, Irina |
---|---|
Format: | Report |
Language: | unknown |
Published: | arXiv 2024 |
Subjects: | Computation and Language (cs.CL); Machine Learning (cs.LG); Computer and information sciences |
Online Access: | https://dx.doi.org/10.48550/arxiv.2406.12564 https://arxiv.org/abs/2406.12564 |
id |
ftdatacite:10.48550/arxiv.2406.12564 |
---|---|
record_format |
openpolar |
spelling |
ftdatacite:10.48550/arxiv.2406.12564 2024-09-15T18:33:29+00:00 Low-Resource Machine Translation through the Lens of Personalized Federated Learning ... Moskvoretskii, Viktor Tupitsa, Nazarii Biemann, Chris Horváth, Samuel Gorbunov, Eduard Nikishina, Irina 2024 https://dx.doi.org/10.48550/arxiv.2406.12564 https://arxiv.org/abs/2406.12564 unknown arXiv Creative Commons Attribution Share Alike 4.0 International https://creativecommons.org/licenses/by-sa/4.0/legalcode cc-by-sa-4.0 Computation and Language cs.CL Machine Learning cs.LG FOS Computer and information sciences Article Preprint article CreativeWork 2024 ftdatacite https://doi.org/10.48550/arxiv.2406.12564 2024-07-03T11:26:46Z We present a new approach based on the Personalized Federated Learning algorithm MeritFed that can be applied to Natural Language Tasks with heterogeneous data. We evaluate it on the Low-Resource Machine Translation task, using the dataset from the Large-Scale Multilingual Machine Translation Shared Task (Small Track #2) and the subset of Sami languages from the multilingual benchmark for Finno-Ugric languages. In addition to its effectiveness, MeritFed is also highly interpretable, as it can be applied to track the impact of each language used for training. Our analysis reveals that target dataset size affects weight distribution across auxiliary languages, that unrelated languages do not interfere with the training, and auxiliary optimizer parameters have minimal impact. Our approach is easy to apply with a few lines of code, and we provide scripts for reproducing the experiments at https://github.com/VityaVitalich/MeritFed ... : 18 pages, 7 figures ... Report sami DataCite |
institution |
Open Polar |
collection |
DataCite |
op_collection_id |
ftdatacite |
language |
unknown |
topic |
Computation and Language cs.CL Machine Learning cs.LG FOS Computer and information sciences |
spellingShingle |
Computation and Language cs.CL Machine Learning cs.LG FOS Computer and information sciences Moskvoretskii, Viktor Tupitsa, Nazarii Biemann, Chris Horváth, Samuel Gorbunov, Eduard Nikishina, Irina Low-Resource Machine Translation through the Lens of Personalized Federated Learning ... |
topic_facet |
Computation and Language cs.CL Machine Learning cs.LG FOS Computer and information sciences |
description |
We present a new approach based on the Personalized Federated Learning algorithm MeritFed that can be applied to Natural Language Tasks with heterogeneous data. We evaluate it on the Low-Resource Machine Translation task, using the dataset from the Large-Scale Multilingual Machine Translation Shared Task (Small Track #2) and the subset of Sami languages from the multilingual benchmark for Finno-Ugric languages. In addition to its effectiveness, MeritFed is also highly interpretable, as it can be applied to track the impact of each language used for training. Our analysis reveals that target dataset size affects weight distribution across auxiliary languages, that unrelated languages do not interfere with the training, and auxiliary optimizer parameters have minimal impact. Our approach is easy to apply with a few lines of code, and we provide scripts for reproducing the experiments at https://github.com/VityaVitalich/MeritFed ... : 18 pages, 7 figures ... |
format |
Report |
author |
Moskvoretskii, Viktor; Tupitsa, Nazarii; Biemann, Chris; Horváth, Samuel; Gorbunov, Eduard; Nikishina, Irina |
author_facet |
Moskvoretskii, Viktor; Tupitsa, Nazarii; Biemann, Chris; Horváth, Samuel; Gorbunov, Eduard; Nikishina, Irina |
author_sort |
Moskvoretskii, Viktor |
title |
Low-Resource Machine Translation through the Lens of Personalized Federated Learning ... |
title_short |
Low-Resource Machine Translation through the Lens of Personalized Federated Learning ... |
title_full |
Low-Resource Machine Translation through the Lens of Personalized Federated Learning ... |
title_fullStr |
Low-Resource Machine Translation through the Lens of Personalized Federated Learning ... |
title_full_unstemmed |
Low-Resource Machine Translation through the Lens of Personalized Federated Learning ... |
title_sort |
low-resource machine translation through the lens of personalized federated learning ... |
publisher |
arXiv |
publishDate |
2024 |
url |
https://dx.doi.org/10.48550/arxiv.2406.12564 https://arxiv.org/abs/2406.12564 |
genre |
sami |
genre_facet |
sami |
op_rights |
Creative Commons Attribution Share Alike 4.0 International https://creativecommons.org/licenses/by-sa/4.0/legalcode cc-by-sa-4.0 |
op_doi |
https://doi.org/10.48550/arxiv.2406.12564 |
_version_ |
1810475191068786688 |