Journal of East China Normal University (Natural Science) ›› 2021, Vol. 2021 ›› Issue (6): 112-123. doi: 10.3969/j.issn.1000-5641.2021.06.012

• Computer Science •

Research on an Edge-Cloud collaborative acceleration mechanism of deep model based on network compression and partitioning

Nuo WANG, Liying LI, Dongwei QIAN, Tongquan WEI*

  1. School of Computer Science and Technology, East China Normal University, Shanghai 200062, China
  • Received: 2020-09-10 Online: 2021-11-25 Published: 2021-11-26
  • Contact: Tongquan WEI E-mail: tqwei@cs.ecnu.edu

Abstract:

Artificial intelligence (AI) is widely used to process large volumes of data in real time and deliver rapid responses. However, conventional methods for deploying AI-based applications can incur substantial computational and communication overhead. To solve this problem, a deep model Edge-Cloud collaborative acceleration mechanism based on network compression and partitioning is proposed. The mechanism compresses and partitions deep neural networks (DNNs) so that AI models can be deployed in practical applications through Edge-Cloud collaboration for rapid response. First, the proposed method compresses the neural network to reduce its execution latency and generates a new layer that can serve as a candidate partition point. It then trains a series of prediction models to find the best partition point and splits the compressed model into two parts. The two parts are deployed on the edge device and the cloud server, respectively, and collaborate to minimize the overall latency. Experimental results show that, compared with four benchmark methods, the proposed scheme reduces the total latency of the deep model by more than 70%.
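The partition-point search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `best_partition_point`, its parameters, and all numbers are hypothetical. It assumes the per-layer latencies on the edge and in the cloud are already available (the paper obtains such estimates from trained prediction models) and that total latency is the sum of edge compute, transfer, and cloud compute.

```python
from typing import Sequence, Tuple


def best_partition_point(
    edge_ms: Sequence[float],      # assumed per-layer latency on the edge device (ms)
    cloud_ms: Sequence[float],     # assumed per-layer latency on the cloud server (ms)
    output_kb: Sequence[float],    # size of each layer's output (KB)
    input_kb: float,               # size of the raw model input (KB)
    bandwidth_kb_per_ms: float,    # uplink bandwidth from edge to cloud (KB/ms)
) -> Tuple[int, float]:
    """Return (k, latency): layers [0, k) run on the edge, layers [k, n) in the cloud.

    k = 0 runs everything in the cloud; k = n runs everything on the edge.
    """
    n = len(edge_ms)
    if not (len(cloud_ms) == n and len(output_kb) == n):
        raise ValueError("all per-layer sequences must have the same length")
    if bandwidth_kb_per_ms <= 0:
        raise ValueError("bandwidth must be positive")

    best_k, best_latency = 0, float("inf")
    for k in range(n + 1):
        edge_time = sum(edge_ms[:k])
        cloud_time = sum(cloud_ms[k:])
        if k == n:
            transfer_kb = 0.0               # nothing is sent to the cloud
        elif k == 0:
            transfer_kb = input_kb          # the raw input is uploaded
        else:
            transfer_kb = output_kb[k - 1]  # the intermediate activation is uploaded
        total = edge_time + transfer_kb / bandwidth_kb_per_ms + cloud_time
        if total < best_latency:
            best_k, best_latency = k, total
    return best_k, best_latency


if __name__ == "__main__":
    # Made-up numbers: the second layer has a small output, as a compressed
    # layer might, which makes it an attractive place to cut the network.
    k, latency = best_partition_point(
        edge_ms=[4.0, 6.0, 20.0, 30.0],
        cloud_ms=[1.0, 1.0, 2.0, 3.0],
        output_kb=[800.0, 50.0, 40.0, 1.0],
        input_kb=600.0,
        bandwidth_kb_per_ms=10.0,
    )
    print(k, latency)  # prints "2 20.0"
```

With these illustrative values, cutting after the second layer gives 20 ms, against 67 ms for cloud-only and 60 ms for edge-only execution. This shows why a layer with a small output makes a good candidate partition point.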

Key words: Edge-Cloud collaboration, DNN compression, DNN partitioning
