FAST 3D ACOUSTIC SCATTERING VIA DISCRETE LAPLACIAN BASED IMPLICIT FUNCTION ENCODERS

Abstract

Acoustic properties of objects corresponding to scattering characteristics are frequently used for 3D audio content creation, environmental acoustic effects, localization, and acoustic scene analysis. The numeric solvers used to compute these acoustic properties are too slow for interactive applications. We present a novel geometric deep learning algorithm based on the discrete Laplacian and implicit encoders to compute these characteristics for general 3D objects at interactive rates. We use a point cloud approximation of each object, and each point is encoded in a high-dimensional latent space. Our multi-layer network can accurately estimate these acoustic properties for arbitrary topologies and takes less than 1 ms per object on an NVIDIA GeForce RTX 2080 Ti GPU. We also prove that our learning method is permutation and rotation invariant and demonstrate high accuracy on objects that are quite different from the training data. We highlight its application to generating environmental acoustic effects in dynamic environments.

1. INTRODUCTION

Acoustic scattering corresponds to the disturbance of a given incident sound field due to an object's shape and surface properties. It can be regarded as one of the fundamental characteristics of an object. The effect of scattering can be expressed in terms of a scattered sound field, which satisfies Sommerfeld's radiation condition. There is considerable work on modeling and measuring acoustic scattering properties in physics and acoustics, and these characteristics are widely used for sound rendering in games and virtual reality (Mehra et al., 2015; Rungta et al., 2018), noise analysis in indoor scenes (Morales & Manocha, 2018), acoustic modeling of concert halls (Shtrepi et al., 2015), non-line-of-sight (NLOS) imaging (Lindell et al., 2019), understanding room shapes (Dokmanić et al., 2013), receiver placement (Morales et al., 2019), robot sound source localization (An et al., 2019), 3D mapping (Kim et al., 2020), and audio-visual analysis (Sterling et al., 2018).

Acoustic scattering of objects can be modeled accurately using the theory of wave acoustics (Kuttruff, 2016). The scattering characteristics of objects are widely used for sound propagation, which reduces to solving the wave equation in large environments. Given a sound source location and its vibration patterns, acoustic simulation methods predict the perceived sound at another specified location, taking into account the medium the sound passes through and the objects and boundaries it interacts with. While the wave behavior of sound is well understood in physics, it is much more difficult to compute acoustic scattering and sound propagation effects, especially at higher frequencies. Even with state-of-the-art acoustic wave solvers, it can take from hours to days to solve a moderately sized room environment on a powerful workstation. One contributing factor is that wave behavior is frequency dependent, so many frequency bands need to be analyzed separately.
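
For reference, the radiation condition mentioned above can be written out in the frequency domain. This is the standard textbook form, given here only as a sketch: the symbols are assumptions, since this section does not define its notation, and the sign of the second equation depends on the chosen time convention.

```latex
% Assumed notation (not defined in this section):
%   p_s : scattered pressure field,  k = \omega / c : wavenumber,
%   r = \lVert \mathbf{x} \rVert,  time convention e^{-i \omega t}.
\begin{align}
  \nabla^2 p_s(\mathbf{x}) + k^2 p_s(\mathbf{x}) &= 0,
    \qquad \mathbf{x} \text{ outside the object}, \\
  \lim_{r \to \infty} r \left( \frac{\partial p_s}{\partial r} - i k \, p_s \right) &= 0 .
\end{align}
```

The first equation is the Helmholtz equation for the scattered field. The second selects outgoing waves only, so that the scattered field carries energy away from the object.
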
Current methods for computing acoustic scattering characteristics use numeric solvers such as boundary-element methods (BEM). However, their complexity increases as a cubic function of the frequency, and most current implementations are limited to static scenes or environments. No practical solutions are known for computing acoustic scattering properties in dynamic environments, or when objects move or undergo deformation.

Main Results: We present novel techniques based on geometric deep learning on differential coordinates to approximate the acoustic scattering properties of arbitrary objects. Our approach is general and makes no assumptions about an object's shape, genus, or rigidity. We approximate the objects using point clouds, and each point in the point cloud representation is encoded in a high-dimensional latent space. Moreover, the local surface shapes in the latent space are encoded using implicit surfaces, which enables us to handle arbitrary topology. Our network takes the point cloud as input and outputs the spherical harmonic coefficients that represent the acoustic scattering field. We present techniques to generate the ground truth data using an accurate wave solver on a large geometric dataset. We have evaluated the performance on thousands of objects that are very different from the training database (with varying convexity, genus, shape, size, and orientation) and observe high accuracy. We also perform an ablation study to highlight the benefits of our approach. The additional runtime overhead of estimating the scattering field with the neural network is less than 1 ms per object on an NVIDIA GeForce RTX 2080 Ti GPU. We also prove that our learning method is permutation and rotation invariant, which is an important characteristic for accurate computation of acoustic scattering fields.
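
To make the idea of differential coordinates on a point cloud concrete, the sketch below uses one common construction: a uniform-weight k-nearest-neighbor discrete Laplacian, where each point is replaced by its offset from the mean of its neighbors. This is an illustrative assumption and not necessarily the construction used in this work. The function name `knn_differential_coordinates`, the choice `k=8`, and the random test cloud are all hypothetical.

```python
import numpy as np

def knn_differential_coordinates(points, k=8):
    """Return delta_i = x_i - mean(k nearest neighbors of x_i) for an (N, 3) array.

    Assumption: uniform neighbor weights; other Laplacian weightings exist.
    """
    points = np.asarray(points, dtype=np.float64)
    # Pairwise squared distances, shape (N, N).
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)
    # Exclude each point from its own neighborhood.
    np.fill_diagonal(dist2, np.inf)
    # Indices of the k nearest neighbors of every point, shape (N, k).
    nbrs = np.argsort(dist2, axis=1)[:, :k]
    return points - points[nbrs].mean(axis=1)

rng = np.random.default_rng(0)
cloud = rng.normal(size=(64, 3))          # hypothetical point cloud
delta = knn_differential_coordinates(cloud)

# Translating the object leaves the differential coordinates unchanged.
shifted = knn_differential_coordinates(cloud + np.array([1.0, -2.0, 0.5]))

# Reordering the points reorders the output rows in the same way.
perm = rng.permutation(len(cloud))
permuted = knn_differential_coordinates(cloud[perm])

# Rotating the object rotates the differential coordinates with it,
# so their lengths are unaffected by the rotation.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal matrix
rotated = knn_differential_coordinates(cloud @ q.T)

# A symmetric pooling (here: max of lengths) ignores point order entirely.
pooled = np.linalg.norm(delta, axis=1).max()
pooled_perm = np.linalg.norm(permuted, axis=1).max()
```

In this toy setting, the per-point features follow the point ordering and the object's rotation, while a symmetric pooling over per-point lengths is unchanged by both. The invariance proofs claimed in this paper concern the actual network and are not reproduced by this sketch.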

2.1. WAVE-ACOUSTIC SIMULATION

Wave-acoustic simulation methods aim to solve the wave equation that governs the propagation and scattering of sound waves. Conventional numeric methods include the finite-element method (Thompson, 2006), the boundary-element method (Wrobel & Kassab, 2003), and the finite-difference time-domain method (Botteldooren, 1995). These methods all require a proper discretization of the problem domain (e.g., spatial resolution, time resolution), so their time complexity scales drastically with the simulation frequency. Recent works speed up wave simulations through parallelized rectangular decomposition (Morales et al., 2015) or pre-computation structures (Mehra et al., 2015). An alternative way to use wave simulation results in real-time applications is to pre-compute a large number of sound fields in a scene and compress the results using perceptual metrics (Raghuvanshi & Snyder, 2014). While pre-computation methods can save significant runtime cost, they still require non-trivial pre-computation effort for unseen scenarios, which restricts their use cases.
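
The frequency scaling described above can be illustrated with a rough estimate of how many cells a uniform volumetric grid needs. The sketch assumes a fixed number of grid points per wavelength (10 is a rule-of-thumb value chosen here, not taken from this paper), a speed of sound of 343 m/s, and a hypothetical 60 m^3 room. The helper `grid_cells` is illustrative only.

```python
import math

def grid_cells(volume_m3, max_freq_hz, points_per_wavelength=10.0,
               speed_of_sound=343.0):
    """Approximate cell count of a uniform 3D grid that resolves max_freq_hz."""
    wavelength = speed_of_sound / max_freq_hz      # metres
    spacing = wavelength / points_per_wavelength   # cell edge length in metres
    return volume_m3 / spacing ** 3

room_volume = 5.0 * 4.0 * 3.0  # hypothetical 60 m^3 room

for freq in (500.0, 1000.0, 2000.0, 4000.0):
    cells = grid_cells(room_volume, freq)
    print(f"{freq:6.0f} Hz -> {cells:.3e} cells")

# Doubling the maximum frequency halves the spacing in all three dimensions.
ratio = grid_cells(room_volume, 2000.0) / grid_cells(room_volume, 1000.0)
```

Under these assumptions, each doubling of the maximum frequency multiplies the cell count by eight. Explicit time-domain schemes must also shrink the time step along with the grid spacing, so the total work grows faster still.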

2.2. LEARNING-BASED ACOUSTICS

Machine learning techniques have been widely applied in many popular computer science research areas. There is considerable work on developing machine learning methods for audio-visual analysis (Zhang et al., 2017) and acoustic scene classification (Abeßer, 2020); see Bianco et al. (2019) for a thorough list of work. In comparison, far fewer works study the generation of acoustic data from a physical perspective, so methodologies from these popular domains often do not apply directly to our problem of interest. More relevant works focus on estimating room acoustics parameters from recorded signals (Eaton et al., 2016; Genovese et al., 2019; Tsokaktsidis et al., 2019; Tang et al., 2020), which can help physically-based simulators model real-world acoustics more faithfully. Pulkki & Svensson (2019) propose using a neural network to model the acoustic scattering effect of rectangular plates without running a simulation. This is closely related to our goal of bypassing expensive wave simulation while generating plausible sound, although we aim to model objects of more general shapes. Recently, Fan et al. (2020) trained convolutional neural networks (CNNs) to map planar sound fields induced by convex scatterers in 2D, with promising results. Motivated by this last method, we aim to develop networks that can handle object geometry in 3D and extend the generality of this learning-based approach.

2.3. GEOMETRIC DEEP LEARNING AND SHAPE REPRESENTATION

There is considerable recent work on generating plausible shape representations for 3D data, including voxel-based (Zhou & Tuzel, 2018; Sindagi et al., 2019; Meng et al., 2019; Wu et al., 2015), point-based (Charles et al., 2017; Qi et al., 2017; Wang et al., 2019; Li et al., 2018a; Monti et al., 2017; Li et al., 2018b; Yi et al., 2017), and mesh-based (Hanocka et al., 2019) shape representations. This includes work on shape representation by learning implicit surfaces on point clouds (Smirnov et al., 2019), designing a mesh Laplacian for convolution (Tan et al., 2018), hierarchical graph

