Distributed Machine Learning and the Semblance of Trust
The utilisation of large and diverse datasets for machine learning (ML) at scale is required to promote scientific insight into many meaningful problems. However, due to data governance regulations such as GDPR as well as ethical concerns, the aggregation of personal and sensitive data is problematic, which prompted the development of alternative strategies such as distributed ML (DML). Techniques such as Federated Learning (FL) allow the data owner to maintain data governance and perform model training locally without having to share their data. FL and related techniques are often described as privacy-preserving. We explain why this term is not appropriate and outline the risks associated with over-reliance on protocols that were not designed with formal definitions of privacy in mind. We further provide recommendations and examples on how such algorithms can be augmented to provide guarantees of governance, security, privacy and verifiability for a general ML audience without prior exposure to formal privacy techniques. Accepted at The Third AAAI Workshop on Privacy-Preserving Artificial Intelligence.
Main Authors: | Usynin, Dmitrii; Ziller, Alexander; Rueckert, Daniel; Passerat-Palmbach, Jonathan; Kaissis, Georgios |
---|---|
Format: | Article in Journal/Newspaper (Preprint) |
Language: | unknown |
Published: | arXiv, 2021 |
Subjects: | Machine Learning (cs.LG); Cryptography and Security (cs.CR); FOS: Computer and information sciences |
Online Access: | https://dx.doi.org/10.48550/arxiv.2112.11040 https://arxiv.org/abs/2112.11040 |
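The abstract's central claim about FL is that each data owner trains locally and shares only model parameters, never the raw data, with an aggregating server. The sketch below illustrates that data flow with a federated-averaging round on a toy one-dimensional least-squares problem; the loss function, client data, learning rate and step counts are illustrative assumptions, not the paper's method, and (as the paper argues) sharing updates this way is not by itself privacy-preserving.

```python
# Minimal federated-averaging sketch: clients run local SGD and share only
# their updated parameters; the server averages them. Illustrative only.
import random

def local_sgd(w, data, lr=0.1, steps=20):
    """One client's local training on a 1-D least-squares problem y = w*x."""
    for _ in range(steps):
        x, y = random.choice(data)
        grad = 2 * (w * x - y) * x   # gradient of (w*x - y)**2 w.r.t. w
        w -= lr * grad
    return w

def federated_round(w_global, client_datasets):
    """Each client trains locally on private data; only weights leave the client."""
    local_weights = [local_sgd(w_global, d) for d in client_datasets]
    return sum(local_weights) / len(local_weights)

random.seed(0)
# Two clients whose private data follows y = 3x; the raw pairs never leave them.
clients = [[(x, 3 * x) for x in (1.0, 2.0)],
           [(x, 3 * x) for x in (0.5, 1.5)]]
w = 0.0
for _ in range(10):
    w = federated_round(w, clients)
print(round(w, 2))  # converges toward the true slope 3.0
```

Note that the averaged updates still encode information about the clients' data, which is precisely why the paper cautions against calling such protocols privacy-preserving without formal mechanisms (e.g. differential privacy or secure aggregation) layered on top.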
id | ftdatacite:10.48550/arxiv.2112.11040 |
---|---|
institution | Open Polar |
collection | DataCite Metadata Store (German National Library of Science and Technology) |
genre | DML |
op_rights | Creative Commons Attribution 4.0 International (CC-BY 4.0), https://creativecommons.org/licenses/by/4.0/legalcode |
op_doi | https://doi.org/10.48550/arxiv.2112.11040 |