Original title: SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction
Authors: Yuanhui Huang, Wenzhao Zheng, Borui Zhang, Jie Zhou, Jiwen Lu
This paper addresses 3D occupancy prediction for vision-centric autonomous driving: predicting whether each point in the surrounding 3D space is occupied. Existing methods typically rely on 3D occupancy labels, but annotating the occupancy status of every voxel is labor-intensive. SelfOcc instead learns 3D occupancy in a self-supervised way from video sequences alone. It first transforms the input images into a 3D representation of the scene (for example, a bird's-eye-view representation) and constrains that representation directly by treating it as a signed distance field (SDF). 2D images rendered for previous and future frames then serve as the self-supervision signal for learning the 3D representation. An MVS-embedded (multi-view stereo) strategy directly optimizes the SDF-induced rendering weights using multiple depth proposals. Using a single frame as input, SelfOcc outperforms the previous best method, SceneRF, by 58.7% on SemanticKITTI, and the authors report it as the first self-supervised method to produce reasonable 3D occupancy for surround cameras on nuScenes. It also produces high-quality depth, with state-of-the-art results reported on novel depth synthesis, monocular depth estimation, and surround-view depth estimation.
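To make the signed-distance-field idea concrete, here is a minimal sketch of how SDF samples along one camera ray can be turned into rendering weights and an expected depth. It assumes a NeuS-style conversion (a sigmoid of the scaled SDF), which may differ from the exact formulation in the paper. The function names, the `scale` parameter, and the toy ray are all hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x, scale):
    # Logistic function of the scaled SDF value (assumed NeuS-style mapping).
    return 1.0 / (1.0 + math.exp(-scale * x))


def sdf_to_weights(sdf_values, scale=10.0):
    """Convert SDF samples along a ray into per-interval rendering weights."""
    weights = []
    transmittance = 1.0  # probability the ray has not hit a surface yet
    for i in range(len(sdf_values) - 1):
        near = sigmoid(sdf_values[i], scale)
        far = sigmoid(sdf_values[i + 1], scale)
        # Opacity is positive only where the SDF decreases, i.e. the ray
        # is approaching or entering a surface.
        alpha = max((near - far) / max(near, 1e-8), 0.0)
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights


def render_depth(depths, sdf_values, scale=10.0):
    """Expected depth: weighted average of interval midpoints."""
    weights = sdf_to_weights(sdf_values, scale)
    mids = [(depths[i] + depths[i + 1]) / 2.0 for i in range(len(depths) - 1)]
    total = max(sum(weights), 1e-8)
    return sum(w * d for w, d in zip(weights, mids)) / total


# Toy ray: a single flat surface at depth 5.0, so SDF(t) = 5.0 - t.
depths = [0.5 * k for k in range(1, 21)]  # samples at 0.5, 1.0, ..., 10.0
sdf_values = [5.0 - t for t in depths]
rendered = render_depth(depths, sdf_values)
```

On this toy ray the weights concentrate around the zero crossing of the SDF, so `rendered` comes out close to the true surface depth of 5.0. In a self-supervised setup of the kind the summary describes, a depth rendered this way could be compared against adjacent video frames to produce a training signal without 3D labels.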
Original article: https://arxiv.org/abs/2311.12754