MultiPull

Abstract

Reconstructing a continuous surface from a raw 3D point cloud is a challenging task. Recent methods usually train neural networks to overfit on single point clouds to infer signed distance functions (SDFs). However, neural networks tend to smooth local details due to the lack of ground truth signed distances or normals, which limits the performance of overfitting-based methods in reconstruction tasks. To resolve this issue, we propose a novel method, named MultiPull, to learn multi-scale implicit fields from raw point clouds by optimizing accurate SDFs from coarse to fine. We achieve this by mapping 3D query points into a set of frequency features, which makes it possible to leverage multi-level features during optimization. Meanwhile, we introduce optimization constraints from the perspective of spatial distance and normal consistency, which play a key role in point cloud reconstruction based on multi-scale optimization strategies. Our experiments on widely used object and scene benchmarks demonstrate that our method outperforms the state-of-the-art methods in surface reconstruction.

Method

In this paper, we design a neural network to learn an implicit function $f$ from a single 3D point cloud by progressively pulling a set of query points $Q_0$ onto the underlying surface.
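To make the idea of progressive pulling concrete, here is a minimal sketch. It assumes the pulling operation familiar from prior pulling-based methods, where a query moves along the normalized gradient of $f$ by the predicted signed distance, and it substitutes an analytic sphere SDF for the learned network. The names `sphere_sdf`, `pull`, and `multi_step_pull` are illustrative and not taken from the paper's code. The sketch also reuses the same $f$ at every step, whereas the method described here is coarse to fine.

```python
import math

def sphere_sdf(p, radius=1.0):
    """Analytic signed distance to an origin-centred sphere (stand-in for a learned f)."""
    return math.sqrt(sum(c * c for c in p)) - radius

def numerical_gradient(f, p, eps=1e-5):
    """Central-difference gradient of f at point p."""
    grad = []
    for k in range(len(p)):
        hi, lo = list(p), list(p)
        hi[k] += eps
        lo[k] -= eps
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad

def pull(f, q):
    """Assumed pulling rule: q' = q - f(q) * grad f(q) / ||grad f(q)||."""
    d = f(q)
    g = numerical_gradient(f, q)
    norm = math.sqrt(sum(c * c for c in g)) or 1.0  # guard against a zero gradient
    return [qc - d * gc / norm for qc, gc in zip(q, g)]

def multi_step_pull(f, q0, steps=3):
    """Apply the pull repeatedly and return the whole trajectory Q_0, Q_1, ..."""
    trajectory = [list(q0)]
    for _ in range(steps):
        trajectory.append(pull(f, trajectory[-1]))
    return trajectory

# Hypothetical queries outside and inside the unit sphere.
queries = [[2.0, 0.0, 0.0], [0.3, 0.4, 0.0], [1.0, 1.0, 1.0]]
pulled = [multi_step_pull(sphere_sdf, q)[-1] for q in queries]
residuals = [abs(sphere_sdf(p)) for p in pulled]
```

With an exact SDF a single pull already lands on the surface. With a learned, imperfect $f$, each step leaves some residual, which is one reason repeated pulling can be useful.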

Overview of our method. We propose (a) a Frequency Feature Transformation (FFT) module and (b) a Multi-Step Pulling (MSP) module to learn implicit functions from coarse to fine. In (a), we learn Fourier bases $h_{i}(Q)$ from query points $Q$ using the Fourier layer and obtain multi-level frequency features $\{y_{i}\}$ through the Hadamard product. In (b), using the multi-level frequency features from (a) and a linear network LSNN with shared parameters, we calculate the distance $D$ from $Q_{i}$ to its corresponding surface target point $Q_{t}$ to predict a more accurate surface. We visualize the predicted SDF distribution map corresponding to the frequency features in (a), and the reconstruction from each step of SDF prediction on the right side of (b).
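The frequency feature step in (a) can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: each Fourier basis is taken to be $h_i(Q) = \sin(W_i Q + b_i)$, and each level is formed as $y_i = h_i(Q) \odot (A_i y_{i-1} + c_i)$, in the style of a multiplicative filter network. The weights are random rather than learned, and all function names and sizes are hypothetical.

```python
import math
import random

def make_fourier_layer(in_dim, out_dim, rng):
    """Assumed Fourier basis h(q) = sin(W q + b) with random, untrained weights."""
    W = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]
    b = [rng.uniform(-math.pi, math.pi) for _ in range(out_dim)]
    def layer(q):
        return [math.sin(sum(w * x for w, x in zip(row, q)) + bk)
                for row, bk in zip(W, b)]
    return layer

def make_linear(in_dim, out_dim, rng):
    """Plain affine map A y + c with random, untrained weights."""
    A = [[rng.gauss(0.0, 1.0 / math.sqrt(in_dim)) for _ in range(in_dim)]
         for _ in range(out_dim)]
    c = [0.0] * out_dim
    def layer(y):
        return [sum(a * v for a, v in zip(row, y)) + ck for row, ck in zip(A, c)]
    return layer

def frequency_features(q, fourier_layers, linear_layers):
    """Return the levels [y_0, y_1, ...]; each new level is a Hadamard product."""
    y = fourier_layers[0](q)
    features = [y]
    for h, lin in zip(fourier_layers[1:], linear_layers):
        z = lin(y)                                # mix the previous level
        y = [a * b for a, b in zip(h(q), z)]      # element-wise (Hadamard) product
        features.append(y)
    return features

rng = random.Random(0)
levels, width = 3, 8                              # hypothetical sizes
fourier_layers = [make_fourier_layer(3, width, rng) for _ in range(levels)]
linear_layers = [make_linear(width, width, rng) for _ in range(levels - 1)]
features = frequency_features([0.1, -0.2, 0.3], fourier_layers, linear_layers)
```

Because a product of sinusoids is a sum of sinusoids at combined frequencies, deeper levels in this construction can represent higher-frequency detail, which matches the coarse-to-fine reading of the multi-level features.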

Visualization Results

Visualizations of reconstruction results on object-level and scene-level datasets. The transition from blue to yellow represents increasing reconstruction errors, and the red boxes highlight local details.

Comparison on ShapeNet Dataset

Comparison on FAMOUS Dataset

Comparison on SRB Dataset

Comparison on Thingi10K Dataset

Comparison on D-FAUST Dataset

Comparison on 3D-Scene Dataset

Comparison on KITTI Dataset

BibTeX

@inproceedings{takeshi2024multipull,
	  title = {MultiPull: Detailing Signed Distance Functions by Pulling Multi-Level Queries at Multi-Step},
	  author = {Takeshi Noda and Chao Chen and Weiqi Zhang and Xinhai Liu and Yu-Shen Liu and Zhizhong Han},
	  booktitle = {Advances in Neural Information Processing Systems},
	  year = {2024}}

MultiPull: Detailing Signed Distance Functions by Pulling Multi-Level Queries at Multi-Step

NeurIPS 2024