Journal of East China Normal University (Natural Science) ›› 2025, Vol. 2025 ›› Issue (1): 72-81. doi: 10.3969/j.issn.1000-5641.2025.01.006

• Computer Science •

Monocular 3D object detection based on surface height and uncertainty

Yinshuai JI, Jinhua XU*

  1. School of Computer Science and Technology, East China Normal University, Shanghai 200062
  • Received: 2023-11-25 Online: 2025-01-25 Published: 2025-01-20
  • Contact: Jinhua XU E-mail: jhxu@cs.ecnu.edu.cn

Surface-height- and uncertainty-based depth estimation for Mono3D

Yinshuai JI, Jinhua XU*

  1. School of Computer Science and Technology, East China Normal University, Shanghai 200062, China
  • Received: 2023-11-25 Online: 2025-01-25 Published: 2025-01-20
  • Contact: Jinhua XU E-mail: jhxu@cs.ecnu.edu.cn

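Abstract:

Monocular three-dimensional (3D) object detection is a fundamental but challenging task in autonomous driving and robot navigation. Predicting depth directly from a single image is inherently an ill-posed problem. Geometry projection is a powerful depth estimation method that infers an object's depth from its physical height and its projected height in the image plane. However, errors in the estimated heights are amplified in the resulting depth estimate. This work predicts the physical and projected heights of points on the object's surface, rather than the height of the object itself, which yields a set of depth candidates; it also estimates the uncertainty of the heights and combines the depth candidates according to that uncertainty to obtain the final object depth. Experiments demonstrate the effectiveness of this depth estimation method, which achieves state-of-the-art (SOTA) results on the monocular 3D object detection task of the KITTI dataset.

Key words: monocular 3D object detection, depth estimation, geometry projection, autonomous driving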

Abstract:

Monocular three-dimensional (3D) object detection is a fundamental but challenging task in autonomous driving and robotic navigation. Directly predicting object depth from a single image is essentially an ill-posed problem. Geometry projection is a powerful depth estimation method that infers an object’s depth from its physical height and its projected height in the image plane. However, errors in the estimated heights are amplified in the resulting depth estimate. In this study, the physical and projected heights of object surface points (rather than the height of the object itself) were estimated to obtain several depth candidates. In addition, the uncertainties in the heights were estimated, and the final object depth was obtained by combining the depth candidates according to these uncertainties. Experiments demonstrated the effectiveness of the depth estimation method, which achieved state-of-the-art (SOTA) results on the monocular 3D object detection task of the KITTI dataset.
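
The geometry-projection idea summarized above can be illustrated with a minimal sketch. It assumes a pinhole camera, so that depth is z = f · H / h for focal length f (in pixels), physical height H, and projected height h. The abstract does not give the exact combination rule, so the inverse-variance weighting and first-order error propagation used here are assumptions, and the function names, focal length, and sample values are all hypothetical rather than taken from the paper.

```python
import math


def depth_from_heights(focal_px, height_m, proj_height_px):
    """Pinhole geometry: depth z = f * H / h (f and h in pixels, H in metres)."""
    return focal_px * height_m / proj_height_px


def depth_sigma(focal_px, height_m, proj_height_px, sigma_height_m, sigma_proj_px):
    """First-order error propagation from the two height uncertainties to depth.

    Assumes independent errors, so the relative variances of H and h add.
    """
    z = depth_from_heights(focal_px, height_m, proj_height_px)
    rel_var = (sigma_height_m / height_m) ** 2 + (sigma_proj_px / proj_height_px) ** 2
    return z * math.sqrt(rel_var)


def fuse_depths(depths, sigmas):
    """Inverse-variance weighted mean of depth candidates (an assumed fusion rule)."""
    weights = [1.0 / (s * s) for s in sigmas]
    return sum(w * z for w, z in zip(weights, depths)) / sum(weights)


# Hypothetical surface points: (physical height in m, projected height in px,
# sigma of physical height in m, sigma of projected height in px).
FOCAL_PX = 720.0
points = [(1.5, 30.0, 0.05, 1.0), (1.0, 20.5, 0.05, 0.5), (0.5, 9.8, 0.05, 1.5)]

candidates = [depth_from_heights(FOCAL_PX, H, h) for H, h, _, _ in points]
sigmas = [depth_sigma(FOCAL_PX, H, h, sH, sh) for H, h, sH, sh in points]
print(fuse_depths(candidates, sigmas))
```

With these illustrative numbers, a one-pixel error on a 30-pixel projected height moves the depth estimate from 36 m to roughly 37.2 m, which shows why small height errors matter and why candidates with lower uncertainty are given more weight in the sketch.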

Key words: monocular 3D object detection (Mono3D), depth estimation, geometry projection, autonomous driving

CLC number: