Journal of East China Normal University(Natural Science) (Chinese core journal) ›› 2025, Vol. 2025 ›› Issue (1): 72-81. doi: 10.3969/j.issn.1000-5641.2025.01.006
• Computer Science •
Received: 2023-11-25
Online: 2025-01-25
Published: 2025-01-20
Contact: Jinhua XU
E-mail: jhxu@cs.ecnu.edu.cn
Yinshuai JI, Jinhua XU. Surface-height- and uncertainty-based depth estimation for Mono3D [J]. Journal of East China Normal University(Natural Science), 2025, 2025(1): 72-81.
Table 1 Quantitative results on the KITTI test set for the car category
| Method | t/ms | | | | | | |
| | | Easy | Moderate | Hard | Easy | Moderate | Hard |
| PCT[ | – | 21.00 | 13.37 | 11.31 | 29.65 | 19.03 | 15.92 |
| CaDDN[ | 630 | 19.17 | 13.41 | 11.46 | 27.94 | 18.91 | 17.19 |
| MonoDLE[ | 40 | 17.23 | 12.26 | 10.29 | 24.79 | 18.89 | 16.00 |
| MonoDTR[ | 37 | 21.99 | 15.39 | 12.73 | 28.59 | 20.38 | 17.14 |
| MonoJSG[ | 42 | 24.69 | 16.14 | 13.64 | 32.59 | 21.26 | 18.18 |
| MonoDistill[ | 40 | 22.97 | 16.03 | 13.60 | 31.87 | 22.59 | 19.72 |
| DID-M3D[ | 40 | 24.40 | 16.29 | 13.75 | 32.95 | 22.76 | 19.83 |
| MonoDDE[ | 40 | | | | | | |
| MonoRCNN[ | 70 | 18.36 | 12.65 | 10.03 | 25.48 | 18.11 | 14.10 |
| MonoFlex[ | 30 | 19.94 | 13.89 | 12.07 | 28.23 | 19.75 | 16.89 |
| GUP Net[ | 34 | 22.26 | 15.02 | 13.12 | 30.29 | 21.19 | 18.20 |
| SHUD | 40 | | | | | | |
Table 2 Ablation study on the 3D depth heads on the KITTI dataset
| Experiment | |||||||
| Easy | Moderate | Hard | |||||
| a | √ | 31.49(22.48) | 23.50(17.11) | 19.86(14.25) | |||
| b | √ | √ | 33.54(24.32) | 25.27(18.22) | 20.81(14.91) | ||
| c | √ | √ | 33.57(24.53) | 25.35(18.39) | 21.07(15.17) | ||
| d | √ | √ | √ | 35.08(25.16) | 25.96(18.82) | 21.45(15.71) | |
| e | √ | √ | √ | √ | 36.41(26.49) | 26.32(19.10) | 21.76(16.04) |
| 1 | ZAMANAKOS G, TSOCHATZIDIS L, AMANATIADIS A, et al. A comprehensive survey of LIDAR-based 3D object detection methods with deep learning for autonomous driving [J]. Computers & Graphics, 2021, 99: 153-181. |
| 2 | FAN L, PANG Z Q, ZHANG T Y, et al. Embracing single stride 3D object detector with sparse transformer [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR). IEEE, 2022: 8448-8458. |
| 3 | SUN P, TAN M X, WANG W Y, et al. SWFormer: Sparse window transformer for 3D object detection in point clouds [C]// Computer Vision – ECCV 2022, Lecture Notes in Computer Science, vol 13670. Cham: Springer, 2022: 426-442. |
| 4 | SHI G S, LI R F, MA C. PillarNet: Real-time and high-performance pillar-based 3D object detection [C]// Computer Vision – ECCV 2022, Lecture Notes in Computer Science, vol 13670. Cham: Springer, 2022: 35-52. |
| 5 | CAI Y J, LI B Y, JIAO Z Y, et al. Monocular 3D object detection with decoupled structured polygon estimation and height-guided depth estimation [J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 10478-10485. |
| 6 | SHI X P, YE Q, CHEN X Z, et al. Geometry-based distance decomposition for monocular 3D object detection [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. IEEE, 2021: 15172-15181. |
| 7 | LU Y, MA X Z, YANG L, et al. Geometry uncertainty projection network for monocular 3D object detection [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. IEEE, 2021: 3111-3121. |
| 8 | JI Y S, XU J H, SUN S L. A monocular depth estimation method based on object surface point height and uncertainty: CN116843737A [P]. 2023-10-03. |
| 9 | ZHANG Y P, LU J W, ZHOU J. Objects are different: Flexible monocular 3D object detection [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2021: 3289-3298. |
| 10 | LI Z L, QU Z, ZHOU Y, et al. Diversity matters: Fully exploiting depth clues for reliable monocular 3D object detection [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2022: 2791-2800. |
| 11 | MA X Z, ZHANG Y M, XU D, et al. Delving into localization errors for monocular 3D object detection [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2021: 4721-4730. |
| 12 | LI P X, ZHAO H C, LIU P F, et al. RTM3D: Real-time monocular 3D detection from object keypoints for autonomous driving [C]// Computer Vision – ECCV 2020, Lecture Notes in Computer Science, vol 12348. Cham: Springer, 2020: 644-660. |
| 13 | DING M Y, HUO Y Q, YI H W, et al. Learning depth-guided convolutions for monocular 3D object detection [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. IEEE, 2020: 11672-11681. |
| 14 | CHEN X Z, KUNDU K, ZHANG Z Y, et al. Monocular 3D object detection for autonomous driving [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2016: 2147-2156. |
| 15 | BRAZIL G, LIU X M. M3D-RPN: Monocular 3D region proposal network for object detection [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. IEEE, 2019: 9287-9296. |
| 16 | QIN Z Q, LI X. MonoGround: Detecting monocular 3D objects from the ground [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2022: 3793-3802. |
| 17 | PENG L, WU X P, YANG Z, et al. DID-M3D: Decoupling instance depth for monocular 3D object detection [C]// Computer Vision – ECCV 2022, Lecture Notes in Computer Science, vol 13661. Cham: Springer, 2022: 71-88. |
| 18 | SHI S S, WANG X G, LI H S. PointRCNN: 3D object proposal generation and detection from point cloud [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2019: 770-779. |
| 19 | ZHOU Y, TUZEL O. VoxelNet: End-to-end learning for point cloud based 3D object detection [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2018: 4490-4499. |
| 20 | LANG A H, VORA S, CAESAR H, et al. PointPillars: Fast encoders for object detection from point clouds [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2019: 12697-12705. |
| 21 | RODDICK T, KENDALL A, CIPOLLA R. Orthographic feature transform for monocular 3D object detection [EB/OL]. (2018-11-20)[2023-10-08]. https://doi.org/10.48550/arXiv.1811.08188. |
| 22 | READING C, HARAKEH A, CHAE J, et al. Categorical depth distribution network for monocular 3D object detection [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2021: 8555-8564. |
| 23 | WANG Y, CHAO W L, GARG D, et al. Pseudo-lidar from visual depth estimation: Bridging the gap in 3D object detection for autonomous driving [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2019: 8445-8453. |
| 24 | MA X Z, WANG Z H, LI H J, et al. Accurate monocular 3D object detection via color-embedded 3D reconstruction for autonomous driving [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. IEEE, 2019: 6851-6860. |
| 25 | CHONG Z Y, MA X Z, ZHANG H, et al. MonoDistill: Learning spatial features for monocular 3D object detection [EB/OL]. (2022-01-26)[2023-10-08]. https://doi.org/10.48550/arXiv.2201.10830. |
| 26 | HU M, WANG S L, LI B, et al. PENet: Towards precise and efficient image guided depth completion [C]// 2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021: 13656-13662. |
| 27 | PHUONG M, LAMPERT C H. Towards understanding knowledge distillation [EB/OL]. (2021-05-27)[2023-10-08]. https://doi.org/10.48550/arXiv.2105.13093. |
| 28 | ANGER H O. Use of a gamma-ray pinhole camera for in vivo studies [J]. Nature, 1952, 170(4318): 200-201. |
| 29 | YU F, WANG D Q, SHELHAMER E, et al. Deep layer aggregation [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2018: 2403-2412. |
| 30 | MOUSAVIAN A, ANGUELOV D, FLYNN J, et al. 3D bounding box estimation using deep learning and geometry [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2017: 7074-7082. |
| 31 | GEIGER A, LENZ P, STILLER C, et al. Vision meets robotics: The KITTI dataset [J]. The International Journal of Robotics Research, 2013, 32(11): 1231-1237. |
| 32 | KENDALL A, GAL Y. What uncertainties do we need in bayesian deep learning for computer vision? [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, NY, United States: Curran Associates Inc., 2017: 5580-5590. |
| 33 | SIMONELLI A, BULO S R, PORZI L, et al. Disentangling monocular 3D object detection [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. IEEE, 2019: 1991-1999. |
| 34 | WANG L, ZHANG L, ZHU Y, et al. Progressive coordinate transforms for monocular 3D object detection [C]// Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021: 13364-13377. |
| 35 | HUANG K C, WU T H, SU H T, et al. MonoDTR: Monocular 3D object detection with depth-aware transformer [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2022: 4012-4021. |
| 36 | LIAN Q, LI P L, CHEN X Z. MonoJSG: Joint semantic and geometric cost volume for monocular 3D object detection [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE, 2022: 1070-1079. |
| [1] | Wenjing HU, Longquan JIANG, Junlong YU, Yiqian XU, Qipeng LIU, Lei LIANG, Jiahao LI. Knowledge-distillation-based lightweight crop-disease-recognition algorithm [J]. Journal of East China Normal University(Natural Science), 2025, 2025(1): 59-71. |
| [2] | Chang WANG, Dan MA, Huarong XU, Panfeng CHEN, Mei CHEN, Hui LI. SA-MGKT: Multi-graph knowledge tracing method based on self-attention [J]. Journal of East China Normal University(Natural Science), 2024, 2024(5): 20-31. |
| [3] | Zhihong ZHENG, Haichuan SONG. Group contrastive learning for weakly-supervised 3D point cloud semantic segmentation [J]. Journal of East China Normal University(Natural Science), 2024, 2024(2): 108-118. |
| [4] | Xinxin HE, Haichuan SONG. Hidden layer Fourier convolution for non-stationary texture synthesis [J]. Journal of East China Normal University(Natural Science), 2024, 2024(2): 119-130. |
| [5] | Lulu JIANG, Siqi SUN, Haidong ZOU, Lina LU, Rui FENG. Diabetic retinopathy grading based on dual-view image feature fusion [J]. Journal of East China Normal University(Natural Science), 2023, 2023(6): 39-48. |
| [6] | Caidie HUANG, Xinping WANG, Liangyu CHEN, Yong LIU. Research on a knowledge tracking model based on the stacked gated recurrent unit residual network [J]. Journal of East China Normal University(Natural Science), 2022, 2022(6): 68-78. |
| [7] | Haojie WU, Yanjie WANG, Wenbing CAI, Fei WANG, Yang LIU, Peng PU, Shaohui LIN. Correlation operation based on intermediate layers for knowledge method [J]. Journal of East China Normal University(Natural Science), 2022, 2022(5): 115-125. |
| [8] | Xueming ZHOU, Dingjiang HUANG. Survey of few-shot instance segmentation methods [J]. Journal of East China Normal University(Natural Science), 2022, 2022(5): 136-146. |
| [9] | CHEN Hai-long, PENG Wei. Research on improved BP neural network in forecasting traffic accidents [J]. Journal of East China Normal University(Natural Sc, 2017, 2017(2): 61-68. |
