[1] RUSSAKOVSKY O, DENG J, SU H, et al. ImageNet large scale visual recognition challenge [J]. International Journal of Computer Vision, 2015, 115(3): 211-252.
[2] HE Y, SAINATH T N, PRABHAVALKAR R, et al. Streaming end-to-end speech recognition for mobile devices [C]//ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2019: 6381-6385.
[3] DEVLIN J, CHANG M W, LEE K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding [EB/OL]. (2019-05-24)[2020-07-02]. https://arxiv.org/pdf/1810.04805.pdf.
[4] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition [EB/OL]. (2015-04-10)[2020-07-02]. https://arxiv.org/pdf/1409.1556.pdf.
[5] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition [C]//2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2016: 770-778. DOI: 10.1109/CVPR.2016.90.
[6] HUANG G, LIU Z, VAN DER MAATEN L, et al. Densely connected convolutional networks [C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017: 2261-2269. DOI: 10.1109/CVPR.2017.243.
[7] CHENG Y, WANG D, ZHOU P, et al. A survey of model compression and acceleration for deep neural networks [EB/OL]. (2020-06-14)[2020-07-02]. https://arxiv.org/pdf/1710.09282.pdf.
[8] LEI J, GAO X, SONG J, et al. Survey of deep neural network model compression [J]. Journal of Software, 2018, 29(2): 251-266. (in Chinese)
[9] CHOUDHARY T, MISHRA V, GOSWAMI A, et al. A comprehensive survey on model compression and acceleration [J/OL]. Artificial Intelligence Review, 2020. (2020-02-08)[2020-07-02]. https://doi.org/10.1007/s10462-020-09816-7.
[10] LI J Y, ZHAO Y K, XUE Z E, et al. A survey of model compression for deep neural networks [J]. Chinese Journal of Engineering, 2019, 41(10): 1229-1239. (in Chinese)
[11] WANG R J, LI X, LING C X. Pelee: A real-time object detection system on mobile devices [C]//Proceedings of the 32nd International Conference on Neural Information Processing Systems. New York: Curran Associates Inc., 2018: 1967-1976.
[12] CHEN X L, GIRSHICK R, HE K M, et al. TensorMask: A foundation for dense object segmentation [C]//Proceedings of the IEEE International Conference on Computer Vision. 2019: 2061-2069.
[13] SANH V, DEBUT L, CHAUMOND J, et al. DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter [EB/OL]. (2020-01-24)[2020-07-01]. https://arxiv.org/pdf/1910.01108v3.pdf.
[14] QIN Z, LI Z, ZHANG Z, et al. ThunderNet: Towards real-time generic object detection on mobile devices [C]//Proceedings of the IEEE International Conference on Computer Vision. IEEE, 2019: 6718-6727.
[15] ANWAR S, SUNG W. Coarse pruning of convolutional neural networks with random masks [EB/OL]. [2020-07-02]. https://openreview.net/pdf?id=HkvS3Mqxe.
[16] LECUN Y, DENKER J S, SOLLA S A. Optimal brain damage [C]//Advances in Neural Information Processing Systems. 1989: 598-605.
[17] HASSIBI B, STORK D G. Second order derivatives for network pruning: Optimal brain surgeon [C]//Advances in Neural Information Processing Systems. 1993: 164-171.
[18] ZHANG T, YE S, ZHANG K, et al. A systematic DNN weight pruning framework using alternating direction method of multipliers [C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 184-199.
[19] MA X L, GUO F M, NIU W, et al. PCONV: The missing but desirable sparsity in DNN weight pruning for real-time execution on mobile devices [C]//Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI-20). 2020: 5117-5124.
[20] HE Y, ZHANG X, SUN J. Channel pruning for accelerating very deep neural networks [C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 1389-1397.
[21] CHIN T W, DING R, ZHANG C, et al. Towards efficient model compression via learned global ranking [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 1518-1528.
[22] MOLCHANOV P, MALLYA A, TYREE S, et al. Importance estimation for neural network pruning [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019: 11264-11272.
[23] LUO J H, WU J, LIN W. ThiNet: A filter level pruning method for deep neural network compression [C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 5058-5066.
[24] ZHUANG Z W, TAN M K, ZHUANG B, et al. Discrimination-aware channel pruning for deep neural networks [C]//Proceedings of the 32nd International Conference on Neural Information Processing Systems (NIPS'18). New York: Curran Associates Inc., 2018: 883-894.
[25] HE Y, LIU P, WANG Z, et al. Filter pruning via geometric median for deep convolutional neural networks acceleration [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019: 4340-4349.
[26] LIN M, JI R, WANG Y, et al. HRank: Filter pruning using high-rank feature map [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 1529-1538.
[27] LIN X, ZHAO C, PAN W. Towards accurate binary convolutional neural network [C]//Advances in Neural Information Processing Systems. 2017: 345-353.
[28] LIU Z, WU B, LUO W, et al. Bi-Real Net: Enhancing the performance of 1-bit CNNs with improved representational capability and advanced training algorithm [C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 722-737.
[29] HUBARA I, COURBARIAUX M, SOUDRY D, et al. Binarized neural networks [C]//Advances in Neural Information Processing Systems. 2016: 4107-4115.
[30] LI F F, ZHANG B, LIU B. Ternary weight networks [EB/OL]. (2016-11-19)[2020-07-03]. https://arxiv.org/pdf/1605.04711.pdf.
[31] WANG P, CHENG J. Fixed-point factorized networks [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 4012-4020.
[32] BOROUMAND A, GHOSE S, KIM Y, et al. Google workloads for consumer devices: Mitigating data movement bottlenecks [C]//Proceedings of the 23rd International Conference on Architectural Support for Programming Languages and Operating Systems. 2018: 316-331.
[33] HAN S, MAO H Z, DALLY W J. Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding [EB/OL]. (2015-11-20)[2020-07-03]. https://arxiv.org/pdf/1510.00149v3.pdf.
[34] CHEN W, WILSON J, TYREE S, et al. Compressing neural networks with the hashing trick [C]//International Conference on Machine Learning. 2015: 2285-2294.
[35] STOCK P, JOULIN A, GRIBONVAL R, et al. And the bit goes down: Revisiting the quantization of neural networks [EB/OL]. (2019-12-20)[2020-07-02]. https://arxiv.org/pdf/1907.05686.pdf.
[36] CARREIRA-PERPIÑÁN M A, IDELBAYEV Y. Model compression as constrained optimization, with application to neural nets. Part II: Quantization [EB/OL]. (2017-07-13)[2020-07-03]. https://arxiv.org/pdf/1707.04319.pdf.
[37] ZHU S, DONG X, SU H. Binary ensemble neural network: More bits per network or more networks per bit? [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019: 4923-4932.
[38] WANG Z, LU J, TAO C, et al. Learning channel-wise interactions for binary convolutional neural networks [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019: 568-577.
[39] LIU C, DING W, XIA X, et al. Circulant binary convolutional networks: Enhancing the performance of 1-bit DCNNs with circulant back propagation [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019: 2691-2699.
[40] COURBARIAUX M, BENGIO Y, DAVID J P. BinaryConnect: Training deep neural networks with binary weights during propagations [C]//Advances in Neural Information Processing Systems. 2015: 3123-3131.
[41] QIN H T, GONG R H, LIU X L, et al. Forward and backward information retention for accurate binary neural networks [C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2020: 2247-2256.
[42] WANG P, HU Q, ZHANG Y, et al. Two-step quantization for low-bit neural networks [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 4376-4384.
[43] MELLEMPUDI N, KUNDU A, MUDIGERE D, et al. Ternary neural networks with fine-grained quantization [EB/OL]. (2017-05-30)[2020-07-03]. https://arxiv.org/pdf/1705.01462.pdf.
[44] ZHU F, GONG R, YU F, et al. Towards unified INT8 training for convolutional neural network [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 1969-1979.
[45] RASTEGARI M, ORDONEZ V, REDMON J, et al. XNOR-Net: ImageNet classification using binary convolutional neural networks [C]//European Conference on Computer Vision. Cham: Springer, 2016: 525-542.
[46] HINTON G, VINYALS O, DEAN J. Distilling the knowledge in a neural network [EB/OL]. (2015-03-09)[2020-07-04]. https://arxiv.org/pdf/1503.02531.pdf.
[47] TIAN Y L, KRISHNAN D, ISOLA P. Contrastive representation distillation [EB/OL]. (2020-01-18)[2020-07-04]. https://arxiv.org/pdf/1910.10699.pdf.
[48] FURLANELLO T, LIPTON Z C, TSCHANNEN M, et al. Born again neural networks [C]//Proceedings of the 35th International Conference on Machine Learning. 2018: 1602-1611.
[49] GAO M Y, SHEN Y J, LI Q Q, et al. Residual knowledge distillation [EB/OL]. (2020-02-21)[2020-07-04]. https://arxiv.org/pdf/2002.09168.pdf.
[50] HE T, SHEN C, TIAN Z, et al. Knowledge adaptation for efficient semantic segmentation [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019: 578-587.
[51] LIN M, CHEN Q, YAN S C. Network in network [EB/OL]. (2014-03-04)[2020-07-04]. https://arxiv.org/pdf/1312.4400/.
[52] IANDOLA F N, HAN S, MOSKEWICZ M W, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size [EB/OL]. (2016-11-04)[2020-07-04]. https://arxiv.org/pdf/1602.07360.pdf.
[53] HOWARD A G, ZHU M, CHEN B, et al. MobileNets: Efficient convolutional neural networks for mobile vision applications [EB/OL]. (2017-04-17)[2020-07-04]. https://arxiv.org/pdf/1704.04861.pdf.
[54] SANDLER M, HOWARD A, ZHU M, et al. MobileNetV2: Inverted residuals and linear bottlenecks [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 4510-4520.
[55] HOWARD A, SANDLER M, CHU G, et al. Searching for MobileNetV3 [C]//Proceedings of the IEEE International Conference on Computer Vision. 2019: 1314-1324.
[56] ZHANG X, ZHOU X, LIN M, et al. ShuffleNet: An extremely efficient convolutional neural network for mobile devices [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 6848-6856.
[57] MA N, ZHANG X, ZHENG H T, et al. ShuffleNet V2: Practical guidelines for efficient CNN architecture design [C]//Proceedings of the European Conference on Computer Vision (ECCV). 2018: 116-131.
[58] HU J, SHEN L, SUN G. Squeeze-and-excitation networks [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 7132-7141.
[59] HAN K, WANG Y, TIAN Q, et al. GhostNet: More features from cheap operations [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 1580-1589.
[60] CHEN Y, FAN H, XU B, et al. Drop an octave: Reducing spatial redundancy in convolutional neural networks with octave convolution [C]//Proceedings of the IEEE International Conference on Computer Vision. 2019: 3435-3444.
[61] CAMPBELL F W, ROBSON J G. Application of Fourier analysis to the visibility of gratings [J]. The Journal of Physiology, 1968, 197(3): 551-566.
[62] HUANG G, LIU S, VAN DER MAATEN L, et al. CondenseNet: An efficient DenseNet using learned group convolutions [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 2752-2761.
[63] JADERBERG M, VEDALDI A, ZISSERMAN A. Speeding up convolutional neural networks with low rank expansions [EB/OL]. (2014-05-15)[2020-07-04]. https://arxiv.org/pdf/1405.3866.pdf.
[64] POLINO A, PASCANU R, ALISTARH D. Model compression via distillation and quantization [EB/OL]. (2018-02-15)[2020-07-04]. https://arxiv.org/pdf/1802.05668.pdf.
[65] CAI R C, ZHONG C R, YU Y, et al. Convolutional neural network quantization and compression for edge-oriented applications [J]. Journal of Computer Applications, 2018, 38(9): 2449-2454. (in Chinese)
[66] YU X Y, LIU T L, WANG X C, et al. On compressing deep models by low rank and sparse decomposition [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 7370-7379.
[67] CHENG J, WU J X, LENG C, et al. Quantized CNN: A unified approach to accelerate and compress convolutional networks [J]. IEEE Transactions on Neural Networks and Learning Systems, 2018, 29(10): 4730-4743.
[68] HU H Y, PENG R, TAI Y W, et al. Network trimming: A data-driven neuron pruning approach towards efficient deep architectures [EB/OL]. (2016-07-12)[2020-07-04]. https://arxiv.org/pdf/1607.03250.pdf.
[69] WANG R J, LI X, LING C X. Pelee: A real-time object detection system on mobile devices [C]//Advances in Neural Information Processing Systems. 2018: 1963-1972.
[70] LI Y, LI J, LIN W, et al. Tiny-DSOD: Lightweight object detection for resource-restricted usages [EB/OL]. (2018-07-29)[2020-07-04]. https://arxiv.org/pdf/1807.11013.pdf.
[71] TAN M, PANG R, LE Q V. EfficientDet: Scalable and efficient object detection [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 10781-10790.
[72] LI R, WANG Y, LIANG F, et al. Fully quantized network for object detection [C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019: 2810-2819.