Journal of East China Normal University (Natural Science) ›› 2022, Vol. 2022 ›› Issue (5): 115-125. DOI: 10.3969/j.issn.1000-5641.2022.05.010

• Construction and Analysis of Supply Chain Knowledge Graph •

Correlation operation based on intermediate layers for knowledge distillation

Haojie WU1, Yanjie WANG2, Wenbing CAI2, Fei WANG3, Yang LIU4, Peng PU5, Shaohui LIN4,*

    1. The 27th Research Institute of China Electronics Technology Group Corporation, Zhengzhou 450047, China
    2. Beijing Institute of Tracking and Telecommunication Technology, Beijing 100094, China
    3. Unit 63726 of the Chinese People’s Liberation Army, Yinchuan 750004, China
    4. School of Computer Science and Technology, East China Normal University, Shanghai 200062, China
    5. School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
  • Received: 2022-07-08; Online: 2022-09-25; Published: 2022-09-26
  • Contact: Shaohui LIN, E-mail: shlin@cs.ecnu.edu.cn

Abstract:

Convolutional neural networks have achieved remarkable results in artificial intelligence applications such as blockchain, speech recognition, and image understanding. However, improvements in model performance are accompanied by substantial increases in computational and parameter overhead, leading to a series of problems: slow inference, large memory consumption, and difficulty of deployment on mobile devices. Knowledge distillation is a typical model compression method that transfers knowledge from a teacher network to a student network, improving the latter's performance without increasing its number of parameters. How to extract representative knowledge for distillation has become the core issue in this field. In this paper, we present a new knowledge distillation method based on a correlation operation over intermediate layers, which, with the help of data augmentation, captures how image features are learned and transformed at each intermediate stage of the network. We model this feature transformation process with a correlation operation and extract from it a new representation of the teacher network to guide the training of the student network. Experimental results demonstrate that our method outperforms previous state-of-the-art methods on both the CIFAR-10 and CIFAR-100 datasets.
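The abstract does not give the loss formulation, so the following is only a minimal PyTorch sketch of the general idea under stated assumptions. The names `correlation` and `distillation_loss` are hypothetical; the per-stage input-output correlation used here is one plausible reading of "modeling the feature transformation with a correlation operation", and we assume the teacher and student expose intermediate features with matching channel widths at each probed stage.

```python
import torch
import torch.nn.functional as F

def correlation(f_in, f_out):
    # Correlation between a stage's input and output feature maps,
    # used as a stand-in for the paper's correlation operation:
    # it summarizes how the stage transforms its features.
    if f_in.shape[-2:] != f_out.shape[-2:]:
        # Align spatial resolution so the inner product is well defined.
        f_in = F.adaptive_avg_pool2d(f_in, f_out.shape[-2:])
    b, c_in, h, w = f_in.shape
    f_in = f_in.reshape(b, c_in, h * w)
    f_out = f_out.reshape(b, f_out.size(1), h * w)
    # (B, C_in, C_out) correlation matrix, averaged over spatial positions.
    return torch.bmm(f_in, f_out.transpose(1, 2)) / (h * w)

def distillation_loss(student_pairs, teacher_pairs, logits, labels, beta=1e-3):
    # Task loss on ground-truth labels ...
    loss = F.cross_entropy(logits, labels)
    # ... plus matching of per-stage correlation matrices between the
    # student and the teacher. Assumes matching channel widths at each
    # probed stage; otherwise a 1x1 projection on the student is needed.
    for (s_in, s_out), (t_in, t_out) in zip(student_pairs, teacher_pairs):
        loss = loss + beta * F.mse_loss(correlation(s_in, s_out),
                                        correlation(t_in, t_out).detach())
    return loss
```

In use, `student_pairs` and `teacher_pairs` would be lists of (stage input, stage output) feature tensors collected, e.g., via forward hooks while both networks process the same augmented batch, with the teacher's features detached from the graph.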

Key words: convolutional neural networks, model compression, knowledge distillation, knowledge representation, correlation operation
