Low-Resource Machine Translation through the Lens of Personalized Federated Learning

We present a new approach based on the Personalized Federated Learning algorithm MeritFed that can be applied to Natural Language Tasks with heterogeneous data. We evaluate it on the Low-Resource Machine Translation task, using the dataset from the Large-Scale Multilingual Machine Translation Shared...


Bibliographic Details
Main Authors: Moskvoretskii, Viktor, Tupitsa, Nazarii, Biemann, Chris, Horváth, Samuel, Gorbunov, Eduard, Nikishina, Irina
Format: Report
Language: unknown
Published: arXiv 2024
Subjects: Computation and Language (cs.CL), Machine Learning (cs.LG), FOS: Computer and information sciences
Online Access: https://dx.doi.org/10.48550/arxiv.2406.12564
https://arxiv.org/abs/2406.12564
id ftdatacite:10.48550/arxiv.2406.12564
record_format openpolar
spelling ftdatacite:10.48550/arxiv.2406.12564 2024-09-15T18:33:29+00:00 Low-Resource Machine Translation through the Lens of Personalized Federated Learning ... Moskvoretskii, Viktor Tupitsa, Nazarii Biemann, Chris Horváth, Samuel Gorbunov, Eduard Nikishina, Irina 2024 https://dx.doi.org/10.48550/arxiv.2406.12564 https://arxiv.org/abs/2406.12564 unknown arXiv Creative Commons Attribution Share Alike 4.0 International https://creativecommons.org/licenses/by-sa/4.0/legalcode cc-by-sa-4.0 Computation and Language cs.CL Machine Learning cs.LG FOS Computer and information sciences Article Preprint article CreativeWork 2024 ftdatacite https://doi.org/10.48550/arxiv.2406.12564 2024-07-03T11:26:46Z We present a new approach based on the Personalized Federated Learning algorithm MeritFed that can be applied to Natural Language Tasks with heterogeneous data. We evaluate it on the Low-Resource Machine Translation task, using the dataset from the Large-Scale Multilingual Machine Translation Shared Task (Small Track #2) and the subset of Sami languages from the multilingual benchmark for Finno-Ugric languages. In addition to its effectiveness, MeritFed is also highly interpretable, as it can be applied to track the impact of each language used for training. Our analysis reveals that target dataset size affects weight distribution across auxiliary languages, that unrelated languages do not interfere with the training, and auxiliary optimizer parameters have minimal impact. Our approach is easy to apply with a few lines of code, and we provide scripts for reproducing the experiments at https://github.com/VityaVitalich/MeritFed ... : 18 pages, 7 figures ... Report sami DataCite
institution Open Polar
collection DataCite
op_collection_id ftdatacite
language unknown
topic Computation and Language cs.CL
Machine Learning cs.LG
FOS Computer and information sciences
description We present a new approach based on the Personalized Federated Learning algorithm MeritFed that can be applied to Natural Language Tasks with heterogeneous data. We evaluate it on the Low-Resource Machine Translation task, using the dataset from the Large-Scale Multilingual Machine Translation Shared Task (Small Track #2) and the subset of Sami languages from the multilingual benchmark for Finno-Ugric languages. In addition to its effectiveness, MeritFed is also highly interpretable, as it can be applied to track the impact of each language used for training. Our analysis reveals that target dataset size affects weight distribution across auxiliary languages, that unrelated languages do not interfere with the training, and auxiliary optimizer parameters have minimal impact. Our approach is easy to apply with a few lines of code, and we provide scripts for reproducing the experiments at https://github.com/VityaVitalich/MeritFed (18 pages, 7 figures)
format Report
author Moskvoretskii, Viktor
Tupitsa, Nazarii
Biemann, Chris
Horváth, Samuel
Gorbunov, Eduard
Nikishina, Irina
title Low-Resource Machine Translation through the Lens of Personalized Federated Learning
publisher arXiv
publishDate 2024
url https://dx.doi.org/10.48550/arxiv.2406.12564
https://arxiv.org/abs/2406.12564
genre sami
op_rights Creative Commons Attribution Share Alike 4.0 International
https://creativecommons.org/licenses/by-sa/4.0/legalcode
cc-by-sa-4.0
op_doi https://doi.org/10.48550/arxiv.2406.12564
_version_ 1810475191068786688