DO 2D GANS KNOW 3D SHAPE? UNSUPERVISED 3D SHAPE RECONSTRUCTION FROM 2D IMAGE GANS



Figure 1: The first column shows images generated by off-the-shelf 2D GANs trained on RGB images only. The remaining columns show that our method can reconstruct 3D shape (viewed as 3D mesh, surface normal, and texture) from a single 2D image without supervision by exploiting the geometric cues contained in GANs. The last two columns depict 3D-aware image manipulation effects (rotation and relighting) enabled by our framework. More results are provided in the Appendix.

ABSTRACT

Natural images are projections of 3D objects on a 2D image plane. While state-of-the-art 2D generative models like GANs show unprecedented quality in modeling the natural image manifold, it is unclear whether they implicitly capture the underlying 3D object structures. And if so, how could we exploit such knowledge to recover the 3D shapes of objects in the images? To answer these questions, in this work, we present the first attempt to directly mine 3D geometric cues from an off-the-shelf 2D GAN that is trained on RGB images only. Through our investigation, we found that such a pre-trained GAN indeed contains rich 3D knowledge and thus can be used to recover 3D shape from a single 2D image in an unsupervised manner. The core of our framework is an iterative strategy that explores and exploits diverse viewpoint and lighting variations in the GAN image manifold. The framework does not require 2D keypoint or 3D annotations, or strong assumptions on object shapes (e.g. shapes are symmetric), yet it successfully recovers 3D shapes with high precision for human faces, cars, buildings, etc. The recovered 3D shapes immediately allow high-quality image editing like relighting and object rotation. We quantitatively demonstrate the effectiveness of our approach compared to previous methods in both 3D shape reconstruction and face rotation. Our code is available at https://github.com/XingangPan/GAN2Shape.

1. INTRODUCTION

Generative adversarial networks (GANs) (Goodfellow et al., 2014) are capable of modeling the 2D natural image manifold (Zhu et al., 2016) of diverse object categories with high fidelity. Recall that natural images are projections of 3D objects onto the 2D plane, an ideal 2D image

