LIGHT SAMPLING FIELD AND BRDF REPRESENTATION FOR PHYSICALLY-BASED NEURAL RENDERING

Abstract

Physically-based rendering (PBR) is key to the immersive rendering effects used widely in industry to showcase detailed, realistic scenes built from computer graphics assets. A well-known caveat is that the PBR process is computationally heavy and relies on complex capture devices. Inspired by the quality and efficiency of recent volumetric neural rendering, we aim to develop a physically-based neural shader that eliminates device dependency and significantly boosts performance. However, no lighting or material model in current neural rendering approaches can accurately represent the comprehensive lighting models and BRDF properties required by the PBR process. This paper therefore proposes a novel lighting representation that models direct and indirect light locally through a light sampling strategy in a learned light sampling field. We also propose BRDF models that separately represent surface and subsurface scattering details to handle complex objects such as translucent materials (e.g., skin, jade). We then implement the proposed representations in an end-to-end physically-based neural face skin shader, which takes a standard face asset (i.e., geometry, albedo map, and normal map) and an HDRI for illumination as input and generates a photo-realistic rendering as output. Extensive experiments showcase the quality and efficiency of our PBR face skin shader, indicating the effectiveness of the proposed lighting and material representations.

1. INTRODUCTION

Physically-based rendering (PBR) provides a shading and rendering methodology that accurately represents how light interacts with objects in virtual 3D scenes. Whether in a real-time rendering system or in film production, a PBR process facilitates the creation of images that look as if they exist in the real world, for a more immersive experience. Industrial PBR pipelines take the guesswork out of authoring surface attributes such as transparency, since their algorithms are based on physically accurate formulae and resemble real-world materials. This process, however, relies on onerous artist tuning and high computational power over a long production cycle. In recent years, academia has shown incredible success using differentiable neural rendering on extensive tasks such as view synthesis (Mildenhall et al., 2020), inverse rendering (Zhang et al., 2021a), and geometry inference (Liu et al., 2019). Driven by the efficiency of neural rendering, a natural next step would be to marry neural rendering with PBR pipelines. However, none of the existing neural rendering representations supports the accuracy, expressiveness, and quality mandated by the industrial PBR process. A PBR workflow models both specular reflection, which refers to light reflected off the surface, and diffusion or subsurface scattering, which describes the effects of light absorbed or scattered internally. Pioneering differentiable neural shaders such as SoftRas (Liu et al., 2019) adopted the Lambertian model as the BRDF representation, which only models diffusion effects and results in low-quality rendering. NeRF (Mildenhall et al., 2020) proposed a novel radiance field representation for realistic view synthesis under an emit-absorb light transport assumption, without explicitly modeling BRDFs or lighting, and is hence limited to a fixed static scene with no scope for relighting.
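The diffuse/specular split discussed above can be made concrete with a minimal sketch. The functions below are illustrative only (a Lambertian diffuse term plus a Blinn-Phong-style specular lobe, not the paper's shader or a production microfacet BRDF); they show why a diffuse-only model like the one in SoftRas cannot produce view-dependent highlights:

```python
import math

def lambertian(albedo, normal, light_dir):
    """Diffuse-only shading: outgoing radiance depends only on N.L,
    so the result is identical from every viewing direction."""
    ndotl = max(0.0, sum(n * l for n, l in zip(normal, light_dir)))
    return [a / math.pi * ndotl for a in albedo]

def blinn_phong_specular(normal, light_dir, view_dir, shininess=32.0):
    """A simple view-dependent specular term (illustrative stand-in for
    the microfacet specular lobes used in real PBR pipelines)."""
    half = [l + v for l, v in zip(light_dir, view_dir)]
    norm = math.sqrt(sum(h * h for h in half)) or 1.0
    half = [h / norm for h in half]  # normalized half-vector
    ndoth = max(0.0, sum(n * h for n, h in zip(normal, half)))
    return ndoth ** shininess
```

With the light overhead, the Lambertian term is constant as the camera moves, while the specular term peaks only when the half-vector aligns with the normal.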
In follow-up work, NeRV (Srinivasan et al., 2020) took one more step by explicitly modeling directional light, albedo, and visibility maps to make the fixed scene relightable. Indirect illumination was achieved by ray tracing under the assumption of one bounce of incoming light. However, this lighting model is computationally very heavy for real-world environment illumination, where more than one incoming directional light exists. To address this problem, NeRD (Boss et al., 2020) and PhySG (Zhang et al., 2021a) employ a low-cost global environment illumination model that uses spherical Gaussians (SGs) to extract parameters from HDRIs. Neural-PIL (Boss et al., 2021) further proposed a pre-trained light encoding network for a more detailed global illumination representation. However, this is still a global representation that assumes the same illumination value for the entire scene, which does not hold in the real world, where illumination is subject to shadows and to indirect light bouncing off objects at different locations in the scene. Thus, it remains an approximation rather than an accurate representation of the environmental illumination. Regarding material (BRDF) modeling, all current works adopt the basic rendering parameters (such as albedo, roughness, and metalness) defined in the rendering software used to prepare the synthetic training data. However, they fail to model intricate real-world objects such as participating media (e.g., smoke, fog) and translucent materials (organics, skin, jade), where high scattering and subsurface scattering cannot be ignored. Such objects require more effort in the traditional PBR process and hence attract greater research interest. In this work, we aim to design accurate, efficient lighting/illumination and BRDF representations to enable a neural PBR process that supports high-quality, photo-realistic rendering in a fast and lightweight manner.
To achieve this goal, we propose a novel lighting representation, a Light Sampling Field, to model both the direct and indirect illumination from HDRI environment maps. Our Light Sampling Field faithfully captures the direct illumination (incoming from light sources) and indirect illumination (the aggregate of all indirect incoming light from the surroundings) at an arbitrary sampling location in a continuous field. Accordingly, we propose BRDF representations in the form of surface specular, surface diffuse, and subsurface scattering components for modeling real-world object materials. This paper mainly evaluates the proposed representations with a novel volumetric neural physically-based shader for human facial skin, trained on an extensive high-quality database that includes real captured ground-truth images as well as synthetic images for illumination augmentation. We also introduce a novel way of integrating surface normals into volumetric rendering for higher fidelity. Coupled with the proposed lighting and BRDF models, our light transport module delivers unprecedented pore-level realism in both on- and underneath-surface appearance. Experiments show that our Light Sampling Field is robust enough to learn illumination effects induced by local geometry, effects that usually can only be modeled by ray tracing; compared to ray tracing, our method therefore compromises neither efficiency nor quality. The main contributions of this paper are as follows:
1) A novel volumetric lighting representation that accurately encodes direct and indirect illumination, positionally and dynamically, given an environment map. To the best of our knowledge, this local representation enables efficient modeling of complicated shading effects such as inter-reflectance in neural rendering for the first time.
2) A BRDF measurement representation that supports the PBR process by modeling specular, diffuse, and subsurface scattering separately.
3) A novel and lightweight neural PBR face shader that takes facial skin assets and environment maps (HDRIs) as input and efficiently renders photo-realistic, high-fidelity, and accurate images comparable to traditional industrial PBR pipelines such as Maya. Our face shader is trained on an image database spanning extensive identities and illuminations. Once trained, our models extract lighting models and BRDFs from the input assets and generalize to novel subjects and illumination maps. Experiments show that our PBR face shader significantly outperforms state-of-the-art neural face rendering approaches in quality and accuracy, indicating the effectiveness of the proposed lighting and material representations.
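A Light Sampling Field, as described here, maps any 3D location to local direct and indirect illumination. The sketch below is a hypothetical interface only: the positional encoding, the stand-in `mlp`, and the 6-channel (direct RGB + indirect RGB) output are assumptions for illustration, not the paper's actual architecture:

```python
import math

def positional_encoding(x, n_freqs=2):
    """NeRF-style frequency encoding of a 3D point (the paper's exact
    input encoding is an assumption here)."""
    out = list(x)
    for i in range(n_freqs):
        for v in x:
            out.append(math.sin((2 ** i) * v))
            out.append(math.cos((2 ** i) * v))
    return out

class LightSamplingField:
    """Hypothetical interface: query per-location illumination.
    `mlp` stands in for a learned network mapping features -> 6 values."""
    def __init__(self, mlp):
        self.mlp = mlp

    def query(self, point):
        feats = positional_encoding(point)
        out = self.mlp(feats)
        direct, indirect = out[:3], out[3:6]  # local RGB illumination
        return direct, indirect
```

The key design point is that the query is positional: two nearby points can return different indirect illumination, which a single global environment representation cannot express.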

2. RELATED WORK

Volumetric Neural Rendering. Volumetric rendering models the interaction of light with volume densities of absorbing, glowing, reflecting, and scattering materials (Max, 1995). A neural volumetric shader trains a model from a set of images and renders novel queried views. The recent state of the art is summarized in a survey (Tewari et al., 2020). In addition, Zhang et al. (2019) introduced the radiance field as a differentiable theory of radiative transfer. Neural Radiance Fields (NeRF) (Mildenhall et al., 2020) further described scenes as a differentiable neural representation, with raycasting integrating color in terms of the transmittance factor, volume density, and voxel diffuse color. Extensions to NeRF were developed for better image encoding (Yu et al., 2020), ray marching (Bi et al., 2020a), network efficiency (Lombardi et al., 2021; Yariv et al., 2020), realistic shading (Suhail et al., 2022) and volumetric radiative decomposition (Bi et al., 2020b; Rebain
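The NeRF-style color integration mentioned above is usually discretized as the quadrature C = sum_i T_i * (1 - exp(-sigma_i * delta_i)) * c_i, with transmittance T_i = exp(-sum_{j<i} sigma_j * delta_j). A minimal sketch of that quadrature, assuming per-sample densities, RGB colors, and inter-sample distances along one ray:

```python
import math

def volume_render(densities, colors, deltas):
    """Accumulate color along a ray: each sample contributes its color
    weighted by its alpha and the transmittance of everything before it."""
    radiance = [0.0, 0.0, 0.0]
    transmittance = 1.0
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        radiance = [r + weight * c for r, c in zip(radiance, color)]
        transmittance *= math.exp(-sigma * delta)  # attenuate what follows
    return radiance
```

An opaque first sample dominates the output, while zero-density samples contribute nothing; this weighting is what makes the representation differentiable end to end.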

