Data Learning Systems

Heterogeneous coding-based federated learning

  • Hongwei SHI,
  • Daocheng HONG,
  • Lianmin SHI,
  • Yingyao YANG
  • 1. School of Information Engineering, Suqian University, Suqian, Jiangsu 223800, China
    2. Shanghai Institute of AI for Education & School of Computer Science and Technology, East China Normal University, Shanghai 200062, China
    3. School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215008, China
    4. The Key Laboratory of Cognitive Computing and Intelligent Information Processing of Fujian Education Institutions, Wuyi University, Wuyishan, Fujian 354300, China
SHI Hongwei, male, Ph.D. candidate, associate professor; research interest: intelligent instrumentation. E-mail: shwtongxin@squ.edu.cn

Received date: 2023-07-19

  Accepted date: 2023-07-19

  Online published: 2023-09-20

Funding

National Natural Science Foundation of China (61977025); Key Research and Development Program (Modern Agriculture) of Jiangsu Province, 2021 (BE2021354); Suqian City Project, 2020 (Z2020133); Suqian Modern Agriculture Project, 2021 (L202109); Open Fund of the Key Laboratory of Cognitive Computing and Intelligent Information Processing of Fujian Education Institutions (KLCCIIP2021201); Suzhou Science and Technology Program (SNG201908)



Cite this article

SHI Hongwei, HONG Daocheng, SHI Lianmin, YANG Yingyao. Heterogeneous coding-based federated learning [J]. Journal of East China Normal University (Natural Science), 2023, 2023(5): 110-121. DOI: 10.3969/j.issn.1000-5641.2023.05.010

Abstract

Heterogeneous federated learning systems span a variety of edge devices, such as personal computers and embedded devices; resource-constrained devices, i.e., stragglers, reduce the training efficiency of the whole system. This paper proposes a heterogeneous coded federated learning (HCFL) system to ① improve system training efficiency by speeding up heterogeneous federated learning (FL) training in multi-straggler scenarios, and ② provide a certain level of data privacy protection. The HCFL scheme designs scheduling strategies from both the client and server perspectives to accelerate model computation on multiple stragglers in general-purpose environments. In addition, a linear coded computing (LCC) scheme is designed to provide data protection for task distribution. Experimental results show that HCFL can reduce training time by 89.85% when the performance gap between devices is large.
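The abstract names linear coded computing for straggler mitigation but gives no construction details. As a generic, hypothetical sketch (not the paper's actual scheme, but in the spirit of coded matrix multiplication à la Lee et al.): split a data matrix into k chunks, encode them into n = k + r linear combinations via a Vandermonde matrix, and decode the desired product from any k worker responses, so up to r stragglers can simply be ignored; each worker also sees only a linear mixture of the raw chunks rather than any original chunk.

```python
import numpy as np

rng = np.random.default_rng(0)

k, r = 3, 2              # k data chunks, r redundant coded chunks
n = k + r                # total number of workers
rows_per_chunk = 4
A = rng.standard_normal((k * rows_per_chunk, 6))    # data matrix
x = rng.standard_normal(6)                          # model/query vector

# Split A into k chunks and encode with a Vandermonde generator matrix G;
# any k of the n coded chunks suffice to reconstruct the originals.
chunks = np.split(A, k)
G = np.vander(np.arange(1, n + 1), k, increasing=True).astype(float)
coded = [sum(G[i, j] * chunks[j] for j in range(k)) for i in range(n)]

# Each worker i computes coded[i] @ x; suppose only the fastest k respond
# (workers 1 and 3 are stragglers and are never waited on).
fast = [0, 2, 4]
partial = np.stack([coded[i] @ x for i in fast])    # shape (k, rows_per_chunk)

# Decode: solve against the k x k submatrix of G for the responders,
# recovering chunks[j] @ x for every j, then reassemble A @ x.
decoded = np.linalg.solve(G[fast, :], partial)
result = decoded.reshape(-1)

assert np.allclose(result, A @ x)
```

The redundancy parameter r trades extra per-worker computation for tolerance to r stragglers, which is the efficiency lever the abstract describes; the mixing also provides the limited (information-obscuring, not cryptographic) data protection the abstract claims for task distribution.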
