SALD: SIGN AGNOSTIC LEARNING WITH DERIVATIVES

Abstract

Learning 3D geometry directly from raw data, such as point clouds, triangle soups, or unoriented meshes, is still a challenging task that feeds many downstream computer vision and graphics applications. In this paper, we introduce SALD: a method for learning implicit neural representations of shapes directly from raw data. We generalize sign agnostic learning (SAL) to include derivatives: given an unsigned distance function to the input raw data, we advocate a novel sign agnostic regression loss, incorporating both pointwise values and gradients of the unsigned distance function. Optimizing this loss leads to a signed implicit function solution, the zero level set of which is a high-quality and valid manifold approximation to the input 3D data. The motivation behind SALD is that incorporating derivatives in a regression loss leads to a lower sample complexity, and consequently better fitting. In addition, we provide empirical evidence, as well as theoretical motivation in 2D, that SAL enjoys a minimal surface property, favoring minimal-area solutions. More importantly, we show that this property still holds for SALD, i.e., with derivatives included. We demonstrate the efficacy of SALD for shape space learning on two challenging datasets: ShapeNet (Chang et al., 2015), which contains inconsistently oriented and non-manifold meshes, and D-Faust (Bogo et al., 2017), which contains raw 3D scans (triangle soups). On both datasets, we present state-of-the-art results.

1. INTRODUCTION

Recently, neural networks (NN) have been used for representing and reconstructing 3D surfaces. Current NN-based 3D learning approaches differ in two aspects: the choice of surface representation, and the supervision method. Common representations of surfaces include using NNs as parametric charts of surfaces (Groueix et al., 2018b; Williams et al., 2019); volumetric implicit function representations defined over regular grids (Wu et al., 2016; Tatarchenko et al., 2017; Jiang et al., 2020); and NNs used directly as volumetric implicit functions (Park et al., 2019; Mescheder et al., 2019; Atzmon et al., 2019; Chen & Zhang, 2019), referred to henceforth as implicit neural representations. Supervision methods include regression of known or approximated volumetric implicit representations (Park et al., 2019; Mescheder et al., 2019; Chen & Zhang, 2019), regression directly against raw 3D data (Atzmon & Lipman, 2020; Gropp et al., 2020), and differentiable rendering using 2D data (i.e., images) as supervision (Niemeyer et al., 2020; Liu et al., 2019; Saito et al., 2019; Yariv et al., 2020). The goal of this paper is to introduce SALD, a method for learning implicit neural representations of surfaces directly from raw 3D data. The benefit of learning directly from raw data, e.g., non-oriented point clouds or triangle soups (e.g., Chang et al. (2015)) and raw scans (e.g., Bogo et al. (2017)), is avoiding the need for a ground truth signed distance representation of all training surfaces for supervision. This allows working with complex models with inconsistent normals and/or missing parts.
In Figure 1 we show reconstructions of zero level sets of SALD-learned implicit neural representations of car models from the ShapeNet dataset (Chang et al., 2015) with a variational autoencoder; notice the high level of detail and the interior, which would not have been possible with, e.g., previous data pre-processing techniques using renderings of visible parts (Park et al., 2019). Our approach improves upon the recent Sign Agnostic Learning (SAL) method (Atzmon & Lipman, 2020) and shows that incorporating derivatives in a sign agnostic manner provides a significant
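To make the loss described above concrete, the following is a minimal NumPy sketch of a sign agnostic regression loss on both values and gradients, in the spirit of SALD. It assumes the unsigned distance h and its gradient have been precomputed from the raw data; the function name `sald_loss` and the weight `lam` are illustrative choices, not taken from the paper's code. The value term matches |f| to h regardless of sign, and the derivative term matches ∇f to ±∇h, again agnostic to the sign of f.

```python
import numpy as np

def sald_loss(f_vals, f_grads, h_vals, h_grads, lam=0.1):
    """Sign agnostic loss on values and gradients (illustrative sketch).

    Value term      : | |f(x)| - h(x) |
    Derivative term : min(||grad f - grad h||, ||grad f + grad h||)
    Both terms are invariant to flipping the sign of f globally.
    """
    value_term = np.abs(np.abs(f_vals) - h_vals)
    d_minus = np.linalg.norm(f_grads - h_grads, axis=-1)
    d_plus = np.linalg.norm(f_grads + h_grads, axis=-1)
    return np.mean(value_term + lam * np.minimum(d_minus, d_plus))

# Toy check in 2D: for the unit circle, h(x) = | ||x|| - 1 | is the
# unsigned distance, and the signed function f(x) = ||x|| - 1 should
# attain zero loss.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, 256)
r = rng.uniform(0.5, 1.5, 256)          # sample points around the circle
x = np.stack([r * np.cos(theta), r * np.sin(theta)], axis=-1)

norms = np.linalg.norm(x, axis=-1)
f_vals = norms - 1.0                     # candidate signed distance
f_grads = x / norms[:, None]             # its gradient
h_vals = np.abs(norms - 1.0)             # unsigned distance to the circle
h_grads = np.sign(norms - 1.0)[:, None] * x / norms[:, None]

loss = sald_loss(f_vals, f_grads, h_vals, h_grads)
```

Note that a globally sign-flipped candidate (-f, -∇f) attains the same loss, which is exactly the sign agnosticism that lets the method learn a signed function from unsigned supervision.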

