Next Best View (NBV) algorithms aim to acquire an optimal set of images for efficient 3D reconstruction of a scene using minimal resources, time, or number of captures. Existing approaches often rely on prior scene knowledge or additional image captures, and typically develop policies that maximize coverage. Yet for many real scenes with complex geometry and self-occlusions, maximizing coverage does not directly translate into better reconstruction quality. In this paper, we propose the View Introspection Network (VIN), which is trained to directly predict the reconstruction-quality improvement of candidate views, and the VIN-NBV policy: a greedy, sequential, sampling-based policy in which, at each acquisition step, we sample multiple query views and choose the one with the highest VIN-predicted improvement score. We design the VIN to perform 3D-aware featurization of the reconstruction built from prior acquisitions and, for each query view, to create a feature that can be decoded into an improvement score. We train the VIN using imitation learning to predict this reconstruction-improvement score. We show that VIN-NBV improves reconstruction quality by ~30% over a coverage-maximization baseline when operating under constraints on the number of acquisitions or the time in motion.
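The greedy sampling-based loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `vin_score`, `capture`, `reconstruct`, and `sample_views` are hypothetical callables standing in for the trained VIN, the camera, the reconstruction backend, and the query-view sampler.

```python
import numpy as np

def vin_nbv_policy(vin_score, capture, reconstruct, sample_views,
                   initial_views, num_acquisitions=10, num_candidates=32):
    """Greedy sequential NBV loop: at each step, score sampled candidate
    views with the VIN and acquire the one with the highest predicted
    reconstruction-improvement score."""
    acquired = list(initial_views)
    images = [capture(v) for v in acquired]
    recon = reconstruct(images)  # reconstruction from prior acquisitions

    for _ in range(num_acquisitions - len(initial_views)):
        candidates = sample_views(num_candidates)           # query views
        scores = [vin_score(recon, v) for v in candidates]  # predicted improvement
        best = candidates[int(np.argmax(scores))]           # greedy choice
        acquired.append(best)
        images.append(capture(best))
        recon = reconstruct(images)                         # update reconstruction

    return recon, acquired
```

Because the loop only needs a scoring function, the stopping condition and selection rule are easy to swap out, which is what makes custom termination criteria straightforward to add.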
Overview of the VIN-NBV policy and the VIN architecture. The VIN is trained to predict the reconstruction improvement of a query view given a set of prior acquisitions; the VIN-NBV policy uses the VIN to select the next best view to acquire. The design of our policy makes it easy to extend with custom termination criteria and decision-making logic.
We report the final average Chamfer distance of our method against prior works on the OmniObject3D houses category with 20 captures, and plot how the average Chamfer distance evolves as more acquisitions are made. Our method outperforms all prior works, and its Chamfer distance continues to improve with additional acquisitions.
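For reference, the symmetric Chamfer distance between a reconstructed point cloud and the ground truth can be computed as below. This is a brute-force sketch for small clouds; the exact variant used in the evaluation (squared vs. unsquared distances, averaging convention) may differ.

```python
import numpy as np

def chamfer_distance(p, q):
    """Symmetric Chamfer distance between point clouds p (N,3) and q (M,3):
    mean nearest-neighbor distance from p to q plus from q to p.
    Brute-force O(N*M) pairwise computation."""
    d = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)  # (N, M) pairwise distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```

Lower is better: identical clouds score 0, and the value grows as the reconstruction drifts from the ground-truth surface.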
An interactive comparison of the final reconstruction after 10 total acquisitions using our method (VIN-NBV) and our coverage baseline (Cov-NBV). Click the different object names to visualize more objects.
Controls: Drag on any panel to rotate the shared camera; use the mouse wheel to zoom. Click here to reset view to default.
We provide an interactive comparison of the final reconstructions under different time-in-motion limits, where the robot must complete all acquisitions within a fixed budget: it may move for at most 15, 30, 45, or 60 seconds during acquisition. We compare our method (VIN-NBV) with the coverage baseline (Cov-NBV) and show the final results.
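A time-in-motion budget can be enforced with a simple gate around the acquisition loop, as in the sketch below. Here `select_next_view` and `travel_time` are hypothetical stand-ins for the VIN-NBV policy and the robot's motion planner; acquisition stops once the next move would exceed the budget.

```python
def acquire_under_motion_budget(select_next_view, travel_time, capture,
                                start_view, budget_s=30.0):
    """Budget-constrained acquisition sketch: keep acquiring greedily
    selected views until the next move would exceed the time-in-motion
    budget (in seconds)."""
    current, spent = start_view, 0.0
    acquired = [capture(current)]
    while True:
        nxt = select_next_view(current)
        t = travel_time(current, nxt)
        if spent + t > budget_s:   # next move would break the budget
            break
        spent += t
        current = nxt
        acquired.append(capture(current))
    return acquired, spent
```

Tightening `budget_s` from 60 down to 15 seconds reproduces the kind of constraint the comparison above visualizes: fewer, more carefully chosen views.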
Controls: Drag on any panel to rotate the shared camera; use the mouse wheel to zoom. Click here to reset view to default.
@misc{frahm2025vinnbvviewintrospectionnetwork,
title={VIN-NBV: A View Introspection Network for Next-Best-View Selection for Resource-Efficient 3D Reconstruction},
author={Noah Frahm and Dongxu Zhao and Andrea Dunn Beltran and Ron Alterovitz and Jan-Michael Frahm and Junier Oliva and Roni Sengupta},
year={2025},
eprint={2505.06219},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2505.06219},
}