Group contrastive learning for weakly-supervised 3D point cloud semantic segmentation

doi:10.3969/j.issn.1000-5641.2024.02.012

Abstract:

Three-dimensional point cloud semantic segmentation is an essential task in 3D visual perception and is widely used in autonomous driving, augmented reality, and robotics. However, most methods operate in a fully-supervised setting and therefore depend on densely annotated datasets, which makes labeling very costly. To reduce this dependence on large-scale annotated point clouds, many weakly-supervised methods train on the labeled data, generate pseudo labels, and retrain the model iteratively. These methods, however, do not address the confirmation bias caused by the accumulation of false pseudo labels. In this study, we propose a weakly-supervised 3D point cloud semantic segmentation method based on group contrastive learning, which constructs a contrast between positive and negative sample groups selected from the pseudo labels. The pseudo labels compete with each other within the group contrastive learning, which reduces the gradient contribution of falsely predicted pseudo labels and thus alleviates confirmation bias. Results on three public large-scale datasets (S3DIS, ScanNet-V2, and Semantic3D) show that our method outperforms state-of-the-art weakly-supervised methods with minimal labeling annotations and even surpasses the performance of some classic fully-supervised methods.

Key words: weakly-supervised learning, 3D point cloud, semantic segmentation, contrastive learning
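
The abstract describes a contrast between a positive group and a negative group in which the pseudo labels compete with each other. A minimal sketch of one way such a loss could look is given below. It is an InfoNCE-style formulation chosen for illustration, not the formula from the paper; the function names, the cosine similarity, and the temperature value of 0.1 are all assumptions of this sketch.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def group_contrastive_loss(anchor, positives, negatives, temperature=0.1):
    """Illustrative InfoNCE-style loss over a positive and a negative group.

    Assumption of this sketch: every positive shares one softmax
    denominator with all other positives and all negatives, so the
    positives compete with one another for probability mass.
    """
    pos_logits = [cosine(anchor, p) / temperature for p in positives]
    neg_logits = [cosine(anchor, n) / temperature for n in negatives]
    all_logits = pos_logits + neg_logits
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(all_logits)
    log_denom = m + math.log(sum(math.exp(x - m) for x in all_logits))
    # Average negative log-probability assigned to the positive group.
    return -sum(x - log_denom for x in pos_logits) / len(pos_logits)


# Hypothetical 2-D features: the second positive in `noisy_pos` mimics a
# wrongly assigned pseudo label that actually resembles a negative.
anchor = [1.0, 0.0]
clean_pos = [[0.9, 0.1], [0.8, 0.2]]
noisy_pos = [[0.9, 0.1], [0.0, 1.0]]
negatives = [[0.0, 1.0], [-1.0, 0.0]]

loss_clean = group_contrastive_loss(anchor, clean_pos, negatives)
loss_noisy = group_contrastive_loss(anchor, noisy_pos, negatives)
print(f"clean group loss: {loss_clean:.3f}")
print(f"noisy group loss: {loss_noisy:.3f}")
```

In this toy setup the group that contains a mislabeled positive yields the larger loss. How the actual method selects the groups and weights their gradients is not specified in the abstract, so the sketch should be read only as an intuition aid.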

CLC number:

TP183

Zhihong ZHENG, Haichuan SONG. Group contrastive learning for weakly-supervised 3D point cloud semantic segmentation[J]. Journal of East China Normal University (Natural Science), 2024, 2024(2): 108-118.

Figures and tables (8): Figures 1-4, Tables 1-4

References (36)

1	ARMENI I, SENER O, ZAMIR A R, et al. 3D semantic parsing of large-scale indoor spaces [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 1534-1543.
2	DAI A, CHANG A X, SAVVA M, et al. ScanNet: Richly-annotated 3D reconstructions of indoor scenes [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 5828-5839.
3	BEHLEY J, GARBADE M, MILIOTO A, et al. SemanticKITTI: A dataset for semantic scene understanding of LiDAR sequences [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019: 9297-9307.
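4	LI Y, TONG G F, YANG J C, et al. Review of key technologies for 3D point cloud scene data acquisition and scene understanding [J]. Laser & Optoelectronics Progress, 2019, 56(4): 21-34. (in Chinese)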
5	LI M, XIE Y, SHEN Y, et al. HybridCR: Weakly-supervised 3D point cloud semantic segmentation via hybrid contrastive regularization [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022: 14930-14939.
6	XU X, LEE G H. Weakly supervised semantic point cloud segmentation: Towards 10 × fewer labels [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 13706-13715.
7	ZHANG Y, QU Y, XIE Y, et al. Perturbed self-distillation: Weakly supervised large-scale point cloud semantic segmentation [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021: 15520-15528.
8	CHENG M, HUI L, XIE J, et al. SSPC-Net: Semi-supervised semantic 3D point cloud segmentation network [C]// Proceedings of the AAAI Conference on Artificial Intelligence. 2021: 1140-1147.
9	LIU Z, QI X, FU C W. One thing one click: A self-training approach for weakly supervised 3D semantic segmentation [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 1726-1736.
10	ZHANG Y, LI Z, XIE Y, et al. Weakly supervised semantic segmentation for large-scale point cloud [C]// Proceedings of the AAAI Conference on Artificial Intelligence. 2021: 3421-3429.
11	HOU J, GRAHAM B, NIEßNER M, et al. Exploring data-efficient 3D scene understanding with contrastive scene contexts [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 15587-15597.
12	XIE S, GU J, GUO D, et al. PointContrast: Unsupervised pre-training for 3D point cloud understanding [C]// Computer Vision–ECCV 2020: 16th European Conference. 2020: 574-591.
13	HACKEL T, SAVINOV N, LADICKY L, et al. Semantic3D.net: A new large-scale point cloud classification benchmark [EB/OL]. (2017-04-12)[2023-01-12]. https://arxiv.org/pdf/1704.03847.pdf.
14	ZHANG J Y, ZHAO X L, CHEN Z. A survey of point cloud semantic segmentation based on deep learning [J]. Laser & Optoelectronics Progress, 2020, 57(4): 28-46. (in Chinese)
15	GUO Y, WANG H, HU Q, et al. Deep learning for 3D point clouds: A survey [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 43(12): 4338-4364.
16	AUDEBERT N, LE SAUX B, LEFÈVRE S. Semantic segmentation of earth observation data using multimodal and multi-scale deep networks [C]// Computer Vision–ACCV 2016: 13th Asian Conference on Computer Vision. 2017: 180-196.
17	TCHAPMI L, CHOY C, ARMENI I, et al. SEGCloud: Semantic segmentation of 3D point clouds [C]// 2017 International Conference on 3D Vision. 2017: 537-547.
18	LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 3431-3440.
19	RETHAGE D, WALD J, STURM J, et al. Fully-convolutional point networks for large-scale point clouds [C]// Proceedings of the European Conference on Computer Vision. 2018: 596-611.
20	QI C R, SU H, MO K, et al. PointNet: Deep learning on point sets for 3D classification and segmentation [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 652-660.
21	QI C R, YI L, SU H, et al. PointNet++: Deep hierarchical feature learning on point sets in a metric space [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017: 5105-5114.
22	JIANG M Y, WU Y R, ZHAO T Q, et al. PointSIFT: A sift-like network module for 3D point cloud semantic segmentation [EB/OL]. (2018-11-24)[2023-01-10]. https://arxiv.org/pdf/1807.00652.pdf.
23	HU Q, YANG B, XIE L, et al. RandLA-Net: Efficient semantic segmentation of large-scale point clouds [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 11108-11117.
24	YANG J, ZHANG Q, NI B, et al. Modeling point clouds with self-attention and gumbel subset sampling [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 3323-3332.
25	THOMAS H, QI C R, DESCHAUD J E, et al. KPConv: Flexible and deformable convolution for point clouds [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019: 6411-6420.
26	LI Y, BU R, SUN M, et al. PointCNN: Convolution on $ \chi $-transformed points [EB/OL]. (2018-11-05)[2023-01-09]. https://arxiv.org/pdf/1801.07791.pdf.
27	WU B, WAN A, YUE X, et al. SqueezeSeg: Convolutional neural nets with recurrent CRF for real-time road-object segmentation from 3D LiDAR point cloud [C]// 2018 IEEE International Conference on Robotics and Automation. 2018: 1887-1893.
28	WU B, ZHOU X, ZHAO S, et al. SqueezeSegV2: Improved model structure and unsupervised domain adaptation for road-object segmentation from a LiDAR point cloud [C]// 2019 International Conference on Robotics and Automation. 2019: 4376-4382.
29	MENG H Y, GAO L, LAI Y K, et al. VV-Net: Voxel VAE net with group convolutions for point cloud segmentation [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019: 8500-8508.
30	WEI J, LIN G, YAP K H, et al. Multi-path region mining for weakly supervised 3D semantic segmentation on point clouds [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 4384-4393.
31	WANG H Y, RONG X J, YANG L, et al. Towards weakly supervised semantic segmentation in 3D graph-structured point clouds of wild scenes [C]// 30th British Machine Vision Conference (BMVC). 2019: 284.
32	JIANG L, SHI S, TIAN Z, et al. Guided point contrastive learning for semi-supervised point cloud semantic segmentation [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021: 6423-6432.
33	WANG X, GAO J, LONG M, et al. Self-tuning for data-efficient deep learning [C]// Proceedings of the International Conference on Machine Learning. 2021: 10738-10748.
34	LEI H, AKHTAR N, MIAN A. Spherical kernel for efficient graph convolution on 3D point clouds [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 43(10): 3664-3680.
35	WU W, QI Z, LI F X. PointConv: Deep convolutional networks on 3D point clouds [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019: 9621-9630.
36	REN Z, MISRA I, SCHWING A G, et al. 3D spatial recognition without spatially labeled 3D [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 13204-13213.

Table 1
Supervision	Method	Labels	mIoU/%
Fully supervised	PointNet [20]	100%	41.1
Fully supervised	SPH3D [34]	100%	59.5
Fully supervised	PointConv [35]	100%	50.3
Fully supervised	KPConv [25]	100%	67.1
Fully supervised	RandLA-Net [23]	100%	63.0
Weakly supervised	Xu et al. [6]	10%	48.0
Weakly supervised	GPCL [32]	5%	53.0
Weakly supervised	SSPC-Net [8]	0.1‰	51.5
Weakly supervised	Zhang et al. [10]	1%	61.8
Weakly supervised	Zhang et al. [10]	1 pt	45.8
Weakly supervised	PSD [7]	1%	63.5
Weakly supervised	PSD [7]	1 pt	48.2
Weakly supervised	HybridCR [5]	1 pt	51.5
Weakly supervised	Ours	1 pt	54.4
Weakly supervised	Ours	10 pt	61.2
Weakly supervised	Ours	100 pt	63.8
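
The mIoU/% column in the results tables is the mean intersection-over-union across semantic classes. A minimal sketch of the computation on per-point labels follows. The function name and the toy labels are hypothetical, and skipping classes that appear in neither the prediction nor the ground truth is an assumption of this sketch, since benchmark conventions differ on that point.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target."""
    ious = []
    for c in range(num_classes):
        # Points where both prediction and ground truth equal class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Points where either prediction or ground truth equals class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both label sets
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with six points and three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
miou = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {100 * miou:.1f}%")  # per-class IoU: 1/3, 2/3, 1/2 -> 50.0%
```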

Table 2
Supervision	Method	Labels	mIoU/%
Fully supervised	PointConv [35]	100%	55.6
Fully supervised	SPH3D [34]	100%	61.0
Fully supervised	KPConv [25]	100%	68.4
Fully supervised	RandLA-Net [23]	100%	64.5
Weakly supervised	MPRM [30]	sub-cloud	41.1
Weakly supervised	GPCL [32]	5%	54.8
Weakly supervised	WyPR [36]	sub-cloud	24.0
Weakly supervised	SSPC-Net [8]	0.1‰	27.1
Weakly supervised	Zhang et al. [10]	1%	51.1
Weakly supervised	PSD [7]	1%	54.7
Weakly supervised	HybridCR [5]	1%	56.8
Weakly supervised	Ours	1 pt	48.8
Weakly supervised	Ours	10 pt	58.2
Weakly supervised	Ours	100 pt	59.1

Table 3
Supervision	Method	Labels	mIoU/%
Fully supervised	KPConv [25]	100%	74.6
Fully supervised	RandLA-Net [23]	100%	77.4
Weakly supervised	Zhang et al. [10]	1%	72.6
Weakly supervised	PSD [7]	1%	75.8
Weakly supervised	HybridCR [5]	1%	76.8
Weakly supervised	Ours	1 pt	29.3
Weakly supervised	Ours	10 pt	57.3
Weakly supervised	Ours	100 pt	72.7

Table 4
Model	Baseline	Pseudo labels	Dynamic threshold	Positive/negative dual branch	Group contrastive learning	mIoU/%
#1	√	-	-	-	-	40.8
#2	√	√	-	-	-	51.2
#3	√	√	√	-	-	52.3
#4	√	√	√	√	-	53.0
#5	√	√	√	√	√	54.4
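
The ablation rows above add one component at a time, so the gain attributable to each step can be read off as the difference between consecutive rows. The short sketch below does that arithmetic. The component names are translations of the table headers, and the variable and function names are illustrative only.

```python
# mIoU (%) per ablation row, copied from the ablation table above.
ablation = [
    ("baseline", 40.8),
    ("+ pseudo labels", 51.2),
    ("+ dynamic threshold", 52.3),
    ("+ positive/negative dual branch", 53.0),
    ("+ group contrastive learning", 54.4),
]


def incremental_gains(rows):
    """Return (component, gain) pairs between consecutive ablation rows."""
    return [
        (name, round(score - prev_score, 1))
        for (_, prev_score), (name, score) in zip(rows, rows[1:])
    ]


gains = incremental_gains(ablation)
for name, gain in gains:
    print(f"{name}: +{gain:.1f} mIoU")

total_gain = round(ablation[-1][1] - ablation[0][1], 1)
print(f"total: +{total_gain:.1f} mIoU")
```

These differences are cumulative in the order the components were added. They do not show what each component would contribute in isolation.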
