Coding for Large-Scale Distributed Machine Learning

This article aims to give a comprehensive and rigorous review of the principles and recent developments of coding for large-scale distributed machine learning (DML). With increasing data volumes and the pervasive deployment of sensors and computing machines, machine learning has become more distributed, and the number of computing nodes and the data volumes involved in learning tasks have increased significantly. For large-scale distributed learning systems, significant challenges have appeared in terms of delay, errors, and efficiency. To address these problems, various error-control and performance-boosting schemes, such as the duplication of computing nodes, have recently been proposed. More recently, error-control coding has been investigated for DML to improve reliability and efficiency; its benefits include high efficiency and low complexity. Despite these benefits and recent progress, however, there is still no comprehensive survey of this topic, especially for large-scale learning. This paper introduces the theories and algorithms of coding for DML. For primal-based DML schemes, we first discuss gradient coding with optimal code distance and then introduce random coding for gradient-based DML. For primal–dual-based DML, i.e., ADMM (alternating direction method of multipliers), we propose a separate coding method for the two steps of distributed optimization and then discuss coding schemes for the different steps. Finally, a few potential directions for future work are given.
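As a hedged illustration of the gradient coding idea mentioned in the description above, the sketch below implements the classic fractional-repetition construction from the gradient coding literature (each group of s + 1 workers replicates the same s + 1 data partitions, so the master recovers the full gradient from any n - s workers). The function names, the toy per-partition gradients, and the straggler pattern are illustrative assumptions, not code or notation from the article itself.

import numpy as np

def make_assignments(n_workers, s):
    # Fractional repetition: split workers into groups of size s + 1 and give
    # every worker in group g the same block of s + 1 data partitions.
    assert n_workers % (s + 1) == 0, "fractional repetition assumes (s + 1) divides n"
    n_groups = n_workers // (s + 1)
    blocks = np.arange(n_workers).reshape(n_groups, s + 1)
    assignment = {int(w): blocks[g].tolist() for g in range(n_groups) for w in blocks[g]}
    return assignment, blocks

def worker_message(w, assignment, partition_grads):
    # Each worker returns the sum of the gradients of its assigned partitions.
    return sum(partition_grads[p] for p in assignment[w])

def master_decode(received, blocks):
    # Any single survivor per group carries that group's partial sum; adding
    # one message per group reconstructs the full gradient.
    total = 0.0
    for group in blocks:
        survivor = next(int(w) for w in group if int(w) in received)
        total = total + received[survivor]
    return total

# Toy run: n = 6 workers, tolerate s = 2 stragglers, gradient dimension 4.
n, s, d = 6, 2, 4
rng = np.random.default_rng(0)
partition_grads = [rng.normal(size=d) for _ in range(n)]   # one gradient per data partition
assignment, blocks = make_assignments(n, s)

stragglers = {1, 3}                                        # any two workers may be slow or lost
received = {w: worker_message(w, assignment, partition_grads)
            for w in range(n) if w not in stragglers}
recovered = master_decode(received, blocks)
assert np.allclose(recovered, sum(partition_grads))        # exact full-gradient recovery

With n = 6 and s = 2, any two stragglers leave at least one live worker in each replication group, so the final check passes; other constructions reviewed in the article, such as the random codes mentioned in the description, trade this replication overhead against decoding complexity.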

Bibliographic Details

Published in: Entropy, Volume 24, Issue 9, Page 1284
Main Authors: Ming Xiao, Mikael Skoglund
Format: Text
Language: English
Published: Multidisciplinary Digital Publishing Institute, 2022-09-12
Subjects: error-control coding; gradient coding; random codes; ADMM; DML
Rights: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
Online Access: https://doi.org/10.3390/e24091284