A comprehensive Chinese core journal of science and technology (Peking University core journal list). Journal of East China Normal University (Natural Science) ›› 2025, Vol. 2025 ›› Issue (6): 29-38. doi: 10.3969/j.issn.1000-5641.2025.06.004
Zefeng HE, Fuke SHEN, Tongquan WEI*
Received: 2024-01-29
Online: 2025-11-25
Published: 2025-11-29
Contact: Tongquan WEI, E-mail: tqwei@cs.ecnu.edu.cn
Zefeng HE, Fuke SHEN, Tongquan WEI. Dual decision adaptive freezing for fast and accurate transfer learning [J]. Journal of East China Normal University (Natural Science), 2025, 2025(6): 29-38.
Table 1  Comparison of accuracy and speed of different models on different datasets using different training methods
| Model | Method | Accuracy/% | Speedup (×) |
| | | CIFAR10 | CIFAR100 | Aircraft | Flowers | Cars | CUB | CIFAR10 | CIFAR100 | Aircraft | Flowers | Cars | CUB |
| MobileNetv2 | Full | 96.35 | 81.50 | 80.35 | 90.81 | 87.31 | 77.30 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| | Last | 69.13 | 46.30 | 39.00 | 82.52 | 44.42 | 65.21 | 1.82 | 2.38 | 1.02 | 1.31 | 1.01 | 1.08 |
| | Pipe | 94.80 | 79.31 | 77.40 | 90.21 | 84.80 | 76.82 | 1.36 | 1.47 | 1.05 | 1.06 | 1.08 | 1.06 |
| | Ours | 96.11 | 80.82 | 79.36 | 90.20 | 86.97 | 76.97 | 1.61 | 1.97 | 1.18 | 1.23 | 1.10 | 1.35 |
| ResNet-50 | Full | 97.60 | 84.96 | 83.17 | 92.70 | 90.11 | 80.60 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| | Last | 74.45 | 44.13 | 29.82 | 83.02 | 42.26 | 63.11 | 1.09 | 1.05 | 1.03 | 1.37 | 1.02 | 1.43 |
| | Pipe | 96.60 | 83.82 | 81.11 | 92.30 | 88.87 | 79.60 | 1.22 | 0.86 | 1.07 | 1.16 | 1.04 | 1.72 |
| | Ours | 97.34 | 84.53 | 82.09 | 92.49 | 89.36 | 80.31 | 1.29 | 1.38 | 1.17 | 1.42 | 1.15 | 2.21 |
| ResNet-101 | Full | 97.82 | 86.25 | 86.14 | 92.80 | 90.82 | 81.01 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| | Last | 81.21 | 60.71 | 38.00 | 84.61 | 43.78 | 66.65 | 1.58 | 1.97 | 1.07 | 1.22 | 1.10 | 1.15 |
| | Pipe | 97.40 | 85.91 | 85.53 | 92.32 | 90.81 | 80.25 | 1.34 | 1.01 | 1.01 | 1.00 | 1.01 | 1.03 |
| | Ours | 97.41 | 85.92 | 85.62 | 92.51 | 90.82 | 80.30 | 1.51 | 1.03 | 1.21 | 1.12 | 1.13 | 1.13 |
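The speedups in Table 1 come from freezing layers that have stopped changing, so their gradients no longer need to be computed. Because backpropagation runs from the output toward the input, a layer's backward work can only be skipped when every layer before it is also frozen; freezing decisions therefore select a contiguous prefix of the network. A minimal sketch of that prefix rule (the stability scores and threshold below are illustrative assumptions, not the paper's dual-decision criteria):

```python
def frozen_prefix(stability, threshold=0.9):
    """Return how many leading layers can be frozen.

    `stability` holds one score per layer, ordered from input to output
    (e.g. similarity of a layer's representation between consecutive
    epochs; higher means more converged). Freezing must be a contiguous
    prefix: a layer's backward pass can only be skipped if every
    earlier layer is frozen too, so we stop at the first unstable layer.
    """
    count = 0
    for score in stability:
        if score >= threshold:
            count += 1
        else:
            break
    return count
```

Under this rule, a converged layer that sits behind a still-changing one gains nothing, which is why prefix-freezing methods tend to help most once the early feature extractors stabilize.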
Table 2  Comparison of accuracy and training speed of our method after removing different decision modules
| Method | Accuracy/% | Speedup (×) |
| | CIFAR10 | CIFAR100 | Aircraft | Flowers | CIFAR10 | CIFAR100 | Aircraft | Flowers |
| Ours (full method) | 96.11 | 80.82 | 79.36 | 90.20 | 1.61 | 1.97 | 1.18 | 1.23 |
| w/o group decision module | 96.00 | 80.63 | 77.86 | 88.81 | 1.44 | 1.67 | 1.16 | 1.02 |
| w/o layer decision module | 95.85 | 80.30 | 75.80 | 87.12 | 1.74 | 1.90 | 1.23 | 1.21 |
| Centered-kernel-alignment-based | 96.00 | 80.63 | 77.86 | 88.81 | 1.44 | 1.67 | 1.16 | 1.02 |
| Gradient-based | 95.47 | 79.83 | 78.46 | 89.92 | 1.75 | 2.40 | 1.32 | 1.30 |
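The centered-kernel-alignment variant in Table 2 scores how similar a layer's representations are across training steps; linear CKA [12] can be computed directly from two activation matrices. A minimal NumPy sketch (function name and shapes are illustrative, not from the paper):

```python
import numpy as np

def linear_cka(x, y):
    """Linear centered kernel alignment between two activation matrices
    of shape (n_samples, n_features). Returns a value in [0, 1]; 1 means
    the representations match up to orthogonal transform and scaling."""
    # Center each feature (column) of both activation matrices.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # ||Y^T X||_F^2 normalized by ||X^T X||_F * ||Y^T Y||_F.
    numerator = np.linalg.norm(y.T @ x, "fro") ** 2
    denominator = np.linalg.norm(x.T @ x, "fro") * np.linalg.norm(y.T @ y, "fro")
    return numerator / denominator
```

A freezing criterion built on this score would compare a layer's activations before and after an epoch and treat a CKA near 1 as evidence the layer has converged.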
| 1 | ZHUANG F Z, QI Z Y, DUAN K Y, et al. A comprehensive survey on transfer learning [J]. Proceedings of the IEEE, 2021, 109(1): 43-76. |
| 2 | RAZAVIAN A S, AZIZPOUR H, SULLIVAN J, et al. CNN features off-the-shelf: An astounding baseline for recognition [C]// 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops. IEEE, 2014: 512-519. |
| 3 | HE Y H, ZHANG X Y, SUN J. Channel pruning for accelerating very deep neural networks [C]// 2017 IEEE International Conference on Computer Vision. IEEE, 2017: 1398-1406. |
| 4 | WU J X, LENG C, WANG Y H, et al. Quantized convolutional neural networks for mobile devices [C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2016: 4820-4828. |
| 5 | HINTON G, VINYALS O, DEAN J. Distilling the knowledge in a neural network [EB/OL]. (2015-03-09)[2023-11-09]. https://arxiv.org/abs/1503.02531. |
| 6 | ZHANG X Y, ZHOU X Y, LIN M X, et al. ShuffleNet: An extremely efficient convolutional neural network for mobile devices [C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2018: 6848-6856. |
| 7 | CAI H, ZHU L G, HAN S. ProxylessNAS: Direct neural architecture search on target task and hardware [EB/OL]. (2019-02-23)[2023-11-20]. https://arxiv.org/pdf/1812.00332. |
| 8 | TANAKA H, KUNIN D, YAMINS D L K, et al. Pruning neural networks without any data by iteratively conserving synaptic flow [C]// Proceedings of the 34th International Conference on Neural Information Processing Systems. ACM, 2020: 6377-6389. |
| 9 | VAN AMERSFOORT J, ALIZADEH M, FARQUHAR S, et al. Single shot structured pruning before training [EB/OL]. (2020-07-01)[2023-11-20]. https://arxiv.org/abs/2007.00389. |
| 10 | PAN S J, YANG Q. A survey on transfer learning [J]. IEEE Transactions on Knowledge and Data Engineering, 2009, 22(10): 1345-1359. |
| 11 | LIU Y H, AGARWAL S, VENKATARAMAN S. AutoFreeze: Automatically freezing model blocks to accelerate fine-tuning [EB/OL]. (2021-02-02)[2023-11-20]. https://arxiv.org/abs/2102.01386. |
| 12 | KORNBLITH S, NOROUZI M, LEE H, et al. Similarity of neural network representations revisited [C]// Proceedings of the 36th International Conference on Machine Learning. PMLR, 2019: 3519-3529. |
| 13 | DENG J, DONG W, SOCHER R, et al. ImageNet: A large-scale hierarchical image database [C]// 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2009: 248-255. |
| 14 | OQUAB M, BOTTOU L, LAPTEV I, et al. Learning and transferring mid-level image representations using convolutional neural networks [C]// 2014 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2014: 1717-1724. |
| 15 | GUO Y H, SHI H H, KUMAR A, et al. SpotTune: Transfer learning through adaptive fine-tuning [C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2019: 4800-4809. |
| 16 | HE C Y, LI S, SOLTANOLKOTABI M, et al. PipeTransformer: Automated elastic pipelining for distributed training of transformers [EB/OL]. (2021-02-05)[2023-11-23]. https://arxiv.org/abs/2102.03161. |
| 17 | KRIZHEVSKY A, HINTON G. Learning multiple layers of features from tiny images [EB/OL]. (2009-04-08) [2023-11-28]. http://www.cs.utoronto.ca/~kriz/learning-features-2009-TR.pdf. |
| 18 | MAJI S, RAHTU E, KANNALA J, et al. Fine-grained visual classification of aircraft [EB/OL]. (2013-06-21)[2023-11-28]. https://arxiv.org/abs/1306.5151. |
| 19 | FAN Y, TIAN F, QIN T, et al. Learning what data to learn [EB/OL]. (2017-02-28)[2023-11-28]. https://arxiv.org/abs/1702.08635. |
| 20 | NILSBACK M E, ZISSERMAN A. Automated flower classification over a large number of classes [C]// 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing. IEEE, 2008: 722-729. |
| 21 | KRAUSE J, STARK M, JIA D, et al. 3D object representations for fine-grained categorization [C]// 2013 IEEE International Conference on Computer Vision Workshops. IEEE, 2013: 554-561. |
| 22 | WAH C, BRANSON S, WELINDER P, et al. The caltech-ucsd birds-200-2011 dataset [EB/OL]. (2011-07-01) [2023-09-22]. https://www.vision.caltech.edu/datasets/cub_200_2011. |