Computer Science

Research on an Edge-Cloud collaborative acceleration mechanism of deep model based on network compression and partitioning

  • Nuo WANG ,
  • Liying LI ,
  • Dongwei QIAN ,
  • Tongquan WEI
  • School of Computer Science and Technology, East China Normal University, Shanghai 200062, China

Received date: 2020-09-10

  Online published: 2021-11-26

Abstract

Advanced artificial intelligence (AI) techniques are widely used to process large volumes of data in real time so as to achieve rapid response. However, conventional methods for deploying AI-based applications incur substantial computational and communication overhead. To solve this problem, an edge-cloud collaborative acceleration mechanism for deep models, based on network compression and partitioning, is proposed. The mechanism compresses and partitions deep neural network (DNN) models so that AI models deployed in practical applications can respond rapidly through edge-cloud collaboration. First, the neural network is compressed to reduce its execution latency, which also generates new layers that can serve as candidate partition points. Prediction models are then trained to find the best partition point, and the compressed network is split into two parts at that point. The two parts are deployed on the edge device and the cloud server, respectively, and work together to minimize the overall latency. Experimental results show that, compared with four benchmark methods, the proposed scheme reduces the total latency of the deep model by at least 70%.

Cite this article

Nuo WANG , Liying LI , Dongwei QIAN , Tongquan WEI . Research on an Edge-Cloud collaborative acceleration mechanism of deep model based on network compression and partitioning[J]. Journal of East China Normal University (Natural Science), 2021 , 2021(6) : 112 -123 . DOI: 10.3969/j.issn.1000-5641.2021.06.012
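
The partition-point search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that per-layer latencies on the edge device and the cloud server are already available (for example, as outputs of the prediction models), that the link bandwidth is known and constant, and that returning the final result to the device costs nothing. The function name `best_partition_point` and all parameter names are hypothetical.

```python
from typing import List, Tuple


def best_partition_point(
    edge_ms: List[float],        # assumed per-layer latency on the edge device (ms)
    cloud_ms: List[float],       # assumed per-layer latency on the cloud server (ms)
    output_bytes: List[int],     # size of each layer's output (bytes)
    input_bytes: int,            # size of the raw model input (bytes)
    bandwidth_bytes_per_ms: float,
) -> Tuple[int, float]:
    """Return (k, latency): layers [0, k) run on the edge, layers [k, n) in the cloud."""
    n = len(edge_ms)
    if len(cloud_ms) != n or len(output_bytes) != n:
        raise ValueError("per-layer lists must have the same length")
    if bandwidth_bytes_per_ms <= 0:
        raise ValueError("bandwidth must be positive")

    best_k, best_latency = 0, float("inf")
    edge_prefix = 0.0             # time spent on layers already assigned to the edge
    cloud_suffix = sum(cloud_ms)  # time for the layers still assigned to the cloud

    for k in range(n + 1):
        if k == n:
            transfer = 0.0        # everything runs on the edge, nothing is uploaded
        else:
            # Data crossing the cut: the raw input if k == 0, else layer k-1's output.
            sent = input_bytes if k == 0 else output_bytes[k - 1]
            transfer = sent / bandwidth_bytes_per_ms
        total = edge_prefix + transfer + cloud_suffix
        if total < best_latency:
            best_k, best_latency = k, total
        if k < n:
            edge_prefix += edge_ms[k]
            cloud_suffix -= cloud_ms[k]
    return best_k, best_latency


# Illustrative numbers only: layer 1 has a small output, so cutting after it wins.
result = best_partition_point(
    edge_ms=[2.0, 3.0, 10.0],
    cloud_ms=[1.0, 1.0, 2.0],
    output_bytes=[8000, 1000, 40],
    input_bytes=6000,
    bandwidth_bytes_per_ms=1000.0,
)
print(result)  # -> (2, 8.0)
```

In this toy setting the cut lands after the layer with the smallest intermediate output, which is consistent with the abstract's point that compression can create new layers that are attractive candidate partition points.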
