GeCoNeRF: Few-shot Neural Radiance Fields via Geometric Consistency
ICML'23
Minseop Kwak*,
Jiuhn Song*,
Seungryong Kim,
Korea University
*Equal contribution. Corresponding author.
[Paper]
[GitHub]
Qualitative comparison on the NeRF-Synthetic dataset shows that in the 3-view setting, our method captures fine details more robustly and produces fewer artifacts than previous methods, as shown in the \(materials\) and \(lego\) scenes.


Abstract

We present a novel framework to regularize Neural Radiance Fields (NeRF) in a few-shot setting with geometry-aware consistency regularization. The proposed approach leverages a depth map rendered at an unobserved viewpoint to warp sparse input images to that viewpoint and imposes them as pseudo ground truths to facilitate the learning of NeRF. By encouraging such geometry-aware consistency at a feature level instead of using a pixel-level reconstruction loss, we regularize NeRF at semantic and structural levels while still allowing view-dependent radiance to be modeled, accounting for color variations across viewpoints. We also propose an effective method to filter out erroneous warped solutions, along with training strategies to stabilize optimization. We show that our model achieves competitive results compared to state-of-the-art few-shot NeRF models.
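The feature-level consistency described above can be sketched as a masked distance between feature maps of the rendered and warped images. The sketch below is illustrative, not the paper's implementation: it assumes the deep features have already been extracted (e.g., by a pretrained network) and that an occlusion mask marks where the warped pseudo ground truth is valid.

```python
import numpy as np

def feature_consistency_loss(feat_rendered, feat_warped, occlusion_mask):
    """Masked feature-level consistency (a minimal sketch; names are
    illustrative). Compares deep features of the rendered and warped
    images rather than raw pixels, ignoring occluded regions.

    feat_rendered, feat_warped: (H, W, C) feature maps
    occlusion_mask: (H, W), 1 where the warped pseudo ground truth is valid
    """
    # Per-pixel L1 distance across the channel dimension.
    diff = np.abs(feat_rendered - feat_warped).sum(axis=-1)
    # Zero out occluded pixels so erroneous warps do not contribute.
    masked = diff * occlusion_mask
    # Normalize by the number of valid pixels (guard against empty masks).
    return masked.sum() / np.maximum(occlusion_mask.sum(), 1.0)
```

Because the loss is computed on features rather than pixels, view-dependent color changes that preserve structure incur little penalty, which is the motivation for avoiding a pixel-level reconstruction loss.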

Overview of our Framework

Given an input image \(\mathcal{I}_\mathrm{i}\) and an estimated depth map \(\mathcal{D}_\mathrm{j}\) at the \(j\)-th unobserved viewpoint, we warp \(\mathcal{I}_\mathrm{i}\) to that novel viewpoint as \(\mathcal{I}_{i \rightarrow j}\) by establishing geometric correspondences between the two viewpoints. Using the warped image as a pseudo ground truth, we encourage the image rendered at the unseen viewpoint, \(\mathcal{I}_\mathrm{j}\), to be structurally consistent with the warped image, taking occlusions into consideration.
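The warping step above is standard depth-guided inverse warping: each pixel of the unobserved view is back-projected with the rendered depth, transformed into the source camera frame, and re-projected to sample the source image. The sketch below is a minimal NumPy illustration under simplifying assumptions (shared intrinsics, nearest-neighbour sampling); the function and argument names are ours, not the paper's code.

```python
import numpy as np

def warp_image(src_img, depth_tgt, K, R, t):
    """Inverse-warp a source view into the target viewpoint using the
    target view's rendered depth map (illustrative sketch).

    src_img:   (H, W, 3) source image I_i
    depth_tgt: (H, W)    depth rendered at the unobserved viewpoint j
    K:         (3, 3)    camera intrinsics (assumed shared by both views)
    R, t:      rotation (3, 3) and translation (3,) mapping target-camera
               coordinates to source-camera coordinates
    Returns the warped image and a validity mask (out-of-bounds or
    behind-camera pixels are marked invalid).
    """
    H, W = depth_tgt.shape
    # Pixel grid of the target view in homogeneous coordinates.
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # (3, H*W)
    # Back-project target pixels to 3D using the rendered depth.
    pts_tgt = np.linalg.inv(K) @ pix * depth_tgt.reshape(1, -1)
    # Transform into the source camera frame and project.
    pts_src = R @ pts_tgt + t[:, None]
    proj = K @ pts_src
    uv = proj[:2] / np.clip(proj[2:], 1e-8, None)
    # Nearest-neighbour sampling with a validity mask.
    us = np.round(uv[0]).astype(int)
    vs = np.round(uv[1]).astype(int)
    valid = (us >= 0) & (us < W) & (vs >= 0) & (vs < H) & (pts_src[2] > 0)
    warped = np.zeros((H * W, 3), dtype=src_img.dtype)
    warped[valid] = src_img[vs[valid], us[valid]]
    return warped.reshape(H, W, 3), valid.reshape(H, W)
```

A sanity check on this sketch: warping with the identity pose (R = I, t = 0) should reproduce the source image exactly, since every pixel projects back onto itself regardless of depth.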

Qualitative Results

Comparison on the NeRF-Synthetic Dataset

Qualitative comparison on the NeRF-Synthetic dataset demonstrates that our model outperforms the baseline mip-NeRF and the previous state-of-the-art model, RegNeRF, in the 3-view setting. We observe that our warping-based consistency enables GeCoNeRF to capture fine details that mip-NeRF and RegNeRF struggle to capture under the same sparse-view scenario, as demonstrated by the \(mic\) scene. Our method also renders smooth surfaces more stably and reduces background artifacts compared to previous models, as shown in the results of the \(materials\) scene. We argue that these results demonstrate how our method, by generating warped pseudo-ground-truth patches, provides the model with local, scene-specific regularization that aids the recovery of fine details, which previous few-shot NeRF models with their global, generalized priors were unable to accomplish.

Comparison on the LLFF Dataset

Qualitative comparison on the LLFF dataset with the baseline mip-NeRF shows that our model learns coherent depth and geometry in the extremely sparse 3-view setting.

Ablation Results

We validate the design choices in our model through an ablation study. We observe that without the consistency modeling loss, our model suffers a sharp decrease in reconstruction fidelity, both quantitatively and qualitatively. We also validate the inclusion of our occlusion mask, progressive training method, and disparity regularization loss.

Paper and Supplementary Material

M. Kwak, J. Song,
S. Kim
GeCoNeRF: Few-shot Neural Radiance Fields via Geometric Consistency
(hosted on arXiv)


[Bibtex]


Acknowledgements

This template was originally made by Phillip Isola and Richard Zhang for a colorful ECCV project; the code can be found here.