DrivingScene: A Multi-Task Online Feed-Forward 3D Gaussian Splatting Method for Dynamic Driving Scenes

Abstract

Real-time, high-fidelity reconstruction of dynamic driving scenes remains challenging under complex motion and sparse views. We propose DrivingScene, an online feed-forward framework that reconstructs 4D dynamic scenes from only two consecutive frames of surround-view images. The core idea is a lightweight residual flow network that predicts per-camera non-rigid motion on top of a learned static prior, explicitly modeling dynamics via scene flow. We also introduce a coarse-to-fine training strategy that avoids the instabilities common in end-to-end optimization. On nuScenes, DrivingScene produces high-quality depth, scene flow, and 3D Gaussian point clouds online, outperforming prior methods in both dynamic-scene reconstruction and novel-view synthesis.
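The residual-flow idea described above can be sketched as follows. This is a minimal illustrative example, not the paper's implementation: the function names, shapes, and the simple additive composition of a rigid (static-prior) flow with a predicted non-rigid residual are all assumptions for clarity.

```python
import numpy as np

def compose_scene_flow(static_flow, residual_flow):
    """Hypothetical composition: total per-pixel 3D scene flow is the
    rigid flow induced by the static prior (e.g., ego-motion + depth)
    plus a learned non-rigid residual for dynamic objects."""
    return static_flow + residual_flow

# Toy per-camera flow fields of shape (H, W, 3).
H, W = 4, 4
static_flow = np.zeros((H, W, 3))        # rigid flow from the static prior
residual_flow = np.zeros((H, W, 3))
residual_flow[1:3, 1:3] = [0.5, 0.0, 0.0]  # a small moving region

total_flow = compose_scene_flow(static_flow, residual_flow)
print(total_flow[2, 2])  # flow at a dynamic pixel
print(total_flow[0, 0])  # flow at a static pixel
```

In this framing, the residual network only has to explain motion the static prior cannot, which is what makes it cheap enough to run per camera in an online setting.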

Publication
Accepted by ICASSP 2026