SALD: SIGN AGNOSTIC LEARNING WITH DERIVATIVES

Abstract

Learning 3D geometry directly from raw data, such as point clouds, triangle soups, or unoriented meshes, remains a challenging task that feeds many downstream computer vision and graphics applications. In this paper, we introduce SALD: a method for learning implicit neural representations of shapes directly from raw data. We generalize sign agnostic learning (SAL) to include derivatives: given an unsigned distance function to the input raw data, we advocate a novel sign agnostic regression loss, incorporating both pointwise values and gradients of the unsigned distance function. Optimizing this loss leads to a signed implicit function solution, the zero level set of which is a high-quality and valid manifold approximation to the input 3D data. The motivation behind SALD is that incorporating derivatives in a regression loss leads to a lower sample complexity, and consequently better fitting. In addition, we provide empirical evidence, as well as theoretical motivation in 2D, that SAL enjoys a minimal surface property, favoring minimal area solutions. More importantly, we show that this property still holds for SALD, i.e., with derivatives included. We demonstrate the efficacy of SALD for shape space learning on two challenging datasets: ShapeNet (Chang et al., 2015), which contains inconsistently oriented and non-manifold meshes, and D-Faust (Bogo et al., 2017), which contains raw 3D scans (triangle soups). On both datasets, we present state-of-the-art results.

1. INTRODUCTION

Recently, neural networks (NNs) have been used for representing and reconstructing 3D surfaces. Current NN-based 3D learning approaches differ in two aspects: the choice of surface representation, and the supervision method. Common surface representations include NNs as parametric charts of surfaces (Groueix et al., 2018b; Williams et al., 2019); volumetric implicit functions defined over regular grids (Wu et al., 2016; Tatarchenko et al., 2017; Jiang et al., 2020); and NNs used directly as volumetric implicit functions (Park et al., 2019; Mescheder et al., 2019; Atzmon et al., 2019; Chen & Zhang, 2019), referred to henceforth as implicit neural representations. Supervision methods include regression against known or approximated volumetric implicit representations (Park et al., 2019; Mescheder et al., 2019; Chen & Zhang, 2019), regression directly against raw 3D data (Atzmon & Lipman, 2020; Gropp et al., 2020), and differentiable rendering with 2D (i.e., image) supervision (Niemeyer et al., 2020; Liu et al., 2019; Saito et al., 2019; Yariv et al., 2020). The goal of this paper is to introduce SALD, a method for learning implicit neural representations of surfaces directly from raw 3D data. The benefit of learning directly from raw data, e.g., unoriented point clouds or triangle soups (e.g., Chang et al., 2015) and raw scans (e.g., Bogo et al., 2017), is avoiding the need for a ground truth signed distance representation of all training surfaces for supervision. This allows working with complex models with inconsistent normals and/or missing parts.
In Figure 1 we show reconstructions of zero level sets of SALD-learned implicit neural representations of car models from the ShapeNet dataset (Chang et al., 2015) with a variational autoencoder; notice the high level of detail and the interior, which would not have been possible with, e.g., previous data pre-processing techniques using renderings of visible parts (Park et al., 2019). Our approach improves upon the recent Sign Agnostic Learning (SAL) method (Atzmon & Lipman, 2020) and shows that incorporating derivatives in a sign agnostic manner provides a significant improvement in surface approximation and detail. SAL is based on the observation that given an unsigned distance function h to some raw 3D data X ⊂ R^3, a sign agnostic regression to h will introduce new local minima that are signed versions of h; in turn, these signed distance functions can be used as implicit representations of the underlying surface. In this paper we show how the sign agnostic regression loss can be extended to compare both function values h and derivatives ∇h, up to a sign. The main motivation for performing NN regression with derivatives is that it reduces the sample complexity of the problem (Czarnecki et al., 2017), leading to better accuracy and generalization. For example, consider a one-hidden-layer NN of the form f(x) = max{ax, bx} + c. Prescribing function values at {-1, 1} is not sufficient to determine f uniquely, while adding derivative information at these points determines f uniquely. We provide empirical evidence as well as theoretical motivation suggesting that both SAL and SALD possess the favorable minimal surface property (Zhao et al., 2001), that is, in areas of missing parts and holes they will prefer zero level sets with minimal area. We justify this property by proving that, in 2D, when restricted to the zero level set (a curve in this case), the SAL and SALD losses encourage a straight line solution connecting neighboring data points.
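The sample-complexity example above can be checked directly. The following is a minimal numpy sketch (the specific parameter choices are ours, for illustration): two networks of the form f(x) = max{ax, bx} + c that agree on the function values at {-1, 1} but differ in their derivatives there, so value samples alone cannot distinguish them.

```python
import numpy as np

def f(x, a, b, c):
    """One-hidden-layer network f(x) = max{ax, bx} + c."""
    return np.maximum(a * x, b * x) + c

def df(x, a, b, c):
    """Derivative of f at x (away from the kink at x = 0)."""
    return np.where(a * x >= b * x, a, b)

xs = np.array([-1.0, 1.0])

# Two distinct parameter choices that agree on the value samples:
#   f1(x) = |x|        (a=1, b=-1, c=0)
#   f2(x) = 2|x| - 1   (a=2, b=-2, c=-1)
v1 = f(xs, 1.0, -1.0, 0.0)
v2 = f(xs, 2.0, -2.0, -1.0)
assert np.allclose(v1, v2)       # both give [1, 1]: values cannot tell them apart

# Their derivatives at the same samples differ, so adding
# derivative supervision resolves the ambiguity.
d1 = df(xs, 1.0, -1.0, 0.0)      # [-1, 1]
d2 = df(xs, 2.0, -2.0, -1.0)     # [-2, 2]
assert not np.allclose(d1, d2)
```

Once the slopes at -1 and 1 are prescribed, a and b are fixed, and a single value sample then fixes c, which is why derivative supervision pins f down uniquely here.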
We have tested SALD on a dataset of man-made models, ShapeNet (Chang et al., 2015), and a human raw scan dataset, D-Faust (Bogo et al., 2017), and compared to state-of-the-art methods. In all cases we used the raw input data X as is and considered the unsigned distance function to X, i.e., h_X, in the SALD loss to produce an approximate signed distance function in the form of a neural network. On ShapeNet, comparing to state-of-the-art methods, we find that SALD achieves superior results. On the D-Faust dataset, comparing to ground truth reconstructions, we report state-of-the-art results, striking a balance between approximating the details of the scans and avoiding overfitting to noise and ghost geometry. Summarizing, the contributions of this paper are:
• Introducing sign agnostic learning with derivatives.
• Identifying and providing a theoretical justification for the minimal surface property of sign agnostic learning in 2D.
• Training directly on raw data (end-to-end), including unoriented or inconsistently oriented triangle soups and raw 3D scans.
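To make the "up to a sign" comparison concrete, the following is a minimal numpy sketch of pointwise sign agnostic terms in the spirit described above: values of a candidate signed function f are regressed to the unsigned distance h up to sign, and, with derivatives included, gradients are compared up to the same sign flip. The function names, the λ weighting, and the toy inputs are ours for illustration; the loss in the paper is defined as an expectation over distributions around the data, not over a fixed point set.

```python
import numpy as np

def sign_agnostic_value_loss(f_vals, h_vals):
    """SAL-style pointwise term: compare f to the unsigned distance h
    up to sign, min(|f - h|, |f + h|)."""
    return np.minimum(np.abs(f_vals - h_vals), np.abs(f_vals + h_vals))

def sign_agnostic_grad_loss(f_grads, h_grads):
    """SALD-style derivative term: compare gradients up to a sign flip,
    min(||grad f - grad h||, ||grad f + grad h||)."""
    return np.minimum(np.linalg.norm(f_grads - h_grads, axis=-1),
                      np.linalg.norm(f_grads + h_grads, axis=-1))

def sald_loss(f_vals, f_grads, h_vals, h_grads, lam=0.1):
    # lam balances value and derivative terms (a hypothetical choice)
    return np.mean(sign_agnostic_value_loss(f_vals, h_vals)
                   + lam * sign_agnostic_grad_loss(f_grads, h_grads))

# A signed flip of the unsigned distance, f = -h with gradients -grad h,
# incurs zero loss: exactly the new signed minima that sign agnostic
# regression admits.
h_vals = np.array([0.5, 1.0])
h_grads = np.array([[1.0, 0.0], [0.0, 1.0]])
assert np.isclose(sald_loss(-h_vals, -h_grads, h_vals, h_grads), 0.0)
```

In practice f and ∇f would come from a neural network and automatic differentiation; the sketch only illustrates why both signed versions of h are global minima of such a loss.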

2. PREVIOUS WORK

Learning 3D shapes with neural networks and 3D supervision has shown great progress recently. We review related work, categorizing existing methods by their choice of 3D surface representation.



Figure 1: Learning the shape space of ShapeNet (Chang et al., 2015) cars directly from raw data using SALD. Note the interior details; top row depicts SALD reconstructions of train data, and bottom row SALD reconstructions of test data.

