Distributed Machine Learning Oriented Data Integrity Verification Scheme in Cloud Computing Environment

Bibliographic Details
Published in: IEEE Access
Main Authors: Xiao-Ping Zhao, Rui Jiang
Format: Article in Journal/Newspaper
Language: English
Published: IEEE 2020
Subjects:
DML
Online Access: https://doi.org/10.1109/ACCESS.2020.2971519
https://doaj.org/article/2a5fa2796ebd45db81e1e50fda353279
Description
Summary: Distributed Machine Learning (DML) is one of the core technologies for Artificial Intelligence (AI). However, existing distributed machine learning frameworks do not take data integrity into account. If network attackers forge, modify, or destroy the data, the training model in the distributed machine learning system will be greatly affected and the training results will be wrong. Therefore, it is crucial to guarantee data integrity in DML. In this paper, we propose a distributed machine learning oriented data integrity verification scheme (DML-DIV) to ensure the integrity of training data. First, we adopt the idea of the Provable Data Possession (PDP) sampling auditing algorithm to achieve data integrity verification, so that our DML-DIV scheme can resist forgery and tampering attacks. Second, we generate a random number, namely a blinding factor, and apply the discrete logarithm problem (DLP) to construct the proof and ensure privacy protection in the third-party auditor (TPA) verification process. Third, we employ identity-based cryptography and two-step key generation technology to generate the data owner's public/private key pair, so that our DML-DIV scheme can solve the key escrow problem and reduce the cost of managing certificates. Finally, formal theoretical analysis and experimental results show the security and efficiency of our DML-DIV scheme.