YOLO-S: A new lightweight helmet wearing detection model

doi:10.3969/j.issn.1000-5641.2021.05.012

Abstract

Traditional worker helmet-wearing detection models commonly used at construction sites suffer from long processing times and high hardware requirements; moreover, the limited training data available for complex and changing environments leads to poor model robustness. In this paper, we propose a lightweight helmet-wearing detection model, named YOLO-S, to address these challenges. First, to handle class imbalance in the data set, a hybrid scene data augmentation method is used to balance the categories and improve the robustness of the model in complex construction environments; in addition, the original YOLOv5s backbone network is replaced with MobileNetV2, which reduces the computational complexity of the network. Second, the model is compressed: a scaling factor is introduced in the batch normalization (BN) layers for sparse training, the importance of each channel is then judged, and redundant channels are pruned, which further reduces the amount of inference computation and increases the overall detection speed. Finally, YOLO-S is obtained by fine-tuning the model with the aid of knowledge distillation. The experimental results show that, compared with YOLOv5s, YOLO-S improves recall by 1.9% and mAP by 1.4%, while the number of model parameters is compressed to 1/3, the model size to 1/4, and the FLOPs to 1/3 of those of YOLOv5s; its inference speed is faster than that of the other models, and its portability is higher.
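
The compression step described above (sparse training on BN scaling factors, followed by channel pruning) can be illustrated with a small sketch. This is a plain-Python illustration under stated assumptions, not the authors' implementation: the function names `l1_subgradient` and `channel_masks`, the penalty weight `lam`, the global-threshold rule, and the toy scaling factors are all hypothetical. It assumes the scheme resembles network slimming, where an L1 penalty pushes unimportant scaling factors toward zero and the channels with the smallest absolute factors are removed.

```python
from typing import Dict, List


def l1_subgradient(gammas: List[float], lam: float) -> List[float]:
    """Subgradient of lam * sum(|gamma|).

    During sparse training this term would be added to the ordinary
    gradient of each BN scaling factor (assumed L1 sparsity penalty).
    """
    return [lam * ((g > 0) - (g < 0)) for g in gammas]


def channel_masks(
    bn_gammas: Dict[str, List[float]], prune_ratio: float
) -> Dict[str, List[bool]]:
    """Return a keep/prune mask per BN layer.

    A single global threshold is chosen so that roughly `prune_ratio`
    of all channels (those with the smallest |gamma|) are pruned.
    """
    all_abs = sorted(abs(g) for gs in bn_gammas.values() for g in gs)
    n_prune = int(len(all_abs) * prune_ratio)
    if n_prune == 0:
        return {name: [True] * len(gs) for name, gs in bn_gammas.items()}

    threshold = all_abs[n_prune - 1]
    masks = {}
    for name, gs in bn_gammas.items():
        mask = [abs(g) > threshold for g in gs]
        if not any(mask):
            # Keep the strongest channel so the layer is never emptied.
            best = max(range(len(gs)), key=lambda i: abs(gs[i]))
            mask[best] = True
        masks[name] = mask
    return masks


# Toy scaling factors for two hypothetical BN layers.
gammas = {"bn1": [0.9, 0.01, 0.5, 0.02], "bn2": [0.03, 0.7, 0.001, 0.4]}
masks = channel_masks(gammas, prune_ratio=0.5)
print(masks)
```

With a pruning ratio of 0.5, the four channels with the smallest absolute scaling factors are marked for removal. In a real network the masks would then be used to rebuild thinner convolution layers before fine-tuning.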

Key words: helmet-wearing detection, data augmentation, model compression, knowledge distillation

CLC Number: TP391

Hongcheng ZHAO, Xiuxia TIAN, Zesen YANG, Wanrong BAI. YOLO-S: A new lightweight helmet wearing detection model[J]. Journal of East China Normal University(Natural Science), 2021, 2021(5): 134-145.

Figures/Tables: 13 (Fig. 1 to Fig. 10, Table 1 to Table 3; images not reproduced here)

Model	Recall	Mean average precision (mAP@0.5)	Inference time (ms/frame)
SSD 300 (VGG16)	0.764	0.732	26
SSD 512 (VGG16)	0.779	0.751	62
RetinaNet	0.892	0.865	39
Faster R-CNN	0.943	0.929	160
YOLO-S	0.934	0.921	17

Model	Recall	Mean average precision (mAP@0.5)	Model size (M)	Inference time (ms/frame)	FLOPs (B)
YOLOv3	0.817	0.803	239	27	156
YOLOv3-SPP	0.825	0.811	269	24	157
YOLOv5s	0.915	0.907	52.5	18	17.0
YOLOv5m	0.941	0.938	166	22	51.3
MobileNetV2-YOLOv3	0.797	0.790	167	19	23.4
YOLO-S(PR 80%)	0.788	0.783	10.2	16	4.63
YOLO-S(PR 30%)	0.935	0.924	14.9	19	5.75
YOLO-S(without PR)	0.924	0.917	15.4	19	5.83
YOLO-S(without SP)	0.939	0.930	15.4	19	5.83
YOLO-S(without DA)	0.911	0.902	13.9	17	4.88
YOLO-S(without Dist)	0.899	0.897	11.4	17	4.70
YOLO-S	0.934	0.921	13.9	17	4.88
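
The headline figures in the abstract can be checked against the YOLOv5s and YOLO-S rows of the table above. The short calculation below is only an arithmetic sketch; the dictionary keys are hypothetical names chosen for illustration, and the values are copied from those two rows.

```python
# Values copied from the YOLOv5s and YOLO-S rows of the comparison table.
yolov5s = {"recall": 0.915, "map50": 0.907, "size_m": 52.5, "flops_b": 17.0}
yolo_s = {"recall": 0.934, "map50": 0.921, "size_m": 13.9, "flops_b": 4.88}

# Absolute gains, expressed in percentage points.
recall_gain = round((yolo_s["recall"] - yolov5s["recall"]) * 100, 1)
map_gain = round((yolo_s["map50"] - yolov5s["map50"]) * 100, 1)

# Compression ratios relative to YOLOv5s.
size_ratio = yolo_s["size_m"] / yolov5s["size_m"]
flops_ratio = yolo_s["flops_b"] / yolov5s["flops_b"]

print(f"recall gain: {recall_gain} points, mAP gain: {map_gain} points")
print(f"model size ratio: {size_ratio:.3f}, FLOPs ratio: {flops_ratio:.3f}")
```

The gains of 1.9 and 1.4 percentage points match the abstract, and the size and FLOPs ratios (about 0.26 and 0.29) are consistent with the stated compression to roughly 1/4 and 1/3.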

Model	Platform 1 inference time (ms/frame)	Platform 2 inference time (ms/frame)
MobileNetV2-YOLOv3	291	311
YOLOv5s	247	289
MobileNetV2-YOLOv5s	240	272
YOLO-S	234	268