SKINNING A PARAMETERIZATION OF THREE-DIMENSIONAL SPACE FOR NEURAL NETWORK CLOTH

Abstract

We present a novel learning framework for cloth deformation by embedding virtual cloth into a tetrahedral mesh that parameterizes the volumetric region of air surrounding the underlying body. In order to maintain this volumetric parameterization during character animation, the tetrahedral mesh is constrained to follow the body surface as it deforms. We embed the cloth mesh vertices into this parameterization of three-dimensional space in order to automatically capture much of the nonlinear deformation due to both joint rotations and collisions. We then train a convolutional neural network to recover ground truth deformation by learning cloth embedding offsets for each skeletal pose. Our experiments show significant improvement over learning cloth offsets from body surface parameterizations, both quantitatively and visually, with prior state of the art having a mean error five standard deviations higher than ours. Without retraining, our neural network generalizes to other body shapes and T-shirt sizes, giving the user some indication of how well clothing might fit. Our results demonstrate the efficacy of a general learning paradigm where high-frequency details can be embedded into low-frequency parameterizations.

1. INTRODUCTION

Cloth is particularly challenging for neural networks to model due to the complex physical processes that govern how cloth deforms. In physical simulation, cloth deformation is typically modeled via a partial differential equation that is discretized with finite element models ranging in complexity from variational energy formulations to basic masses and springs, see e.g. Baraff & Witkin (1998); Bridson et al. (2002; 2003); Grinspun et al. (2003); Baraff et al. (2003); Selle et al. (2008). Mimicking these complex physical processes and numerical algorithms with machine learning inference has shown promise, but still struggles to capture high-frequency folds/wrinkles. PCA-based methods De Aguiar et al. (2010); Hahn et al. (2014) remove important high variance details and struggle with nonlinearities emanating from joint rotations and collisions. More recently, Gundogdu et al. (2019); Santesteban et al. (2019); Patel et al. (2020); Jin et al. (2020) leverage body skinning Magnenat-Thalmann et al. (1988); Lander (1998); Lewis et al. (2000) to capture some degree of the nonlinearity; the cloth is then represented via learned offsets from a co-dimension one skinned body surface. Building on this prior work, we propose replacing the skinned co-dimension one body surface parameterization with a skinned (fully) three-dimensional parameterization of the volume surrounding the body. We parameterize the three-dimensional space corresponding to the volumetric region of air surrounding the body with a tetrahedral mesh. In order to do this, we leverage the work of Lee et al. (2018; 2019), which proposed a number of techniques for creating and deforming such a tetrahedral mesh using a variety of skinning and simulation techniques. The resulting kinematically deforming skinned mesh (KDSM) was shown to be beneficial for both hair animation/simulation Lee et al. (2018) and water simulation Lee et al. (2019). Here, we only utilize the most basic version of the KDSM, assigning skinning weights to its vertices so that it deforms with the underlying joints similar to a skinned body surface (alternatively, one could train a neural network to learn more complex KDSM deformations).

This allows us to make a very straightforward and fair comparison between learning offsets from a skinned body surface and learning offsets from a skinned parameterization of three-dimensional space. Our experiments showed an overall reduction in error of approximately 50% (see Table 2 and Figure 8) as well as the removal of visual/geometric artifacts (see e.g. Figure 9) that can be directly linked to the usage of the body surface mesh, and thus we advocate the KDSM for further study. The neural network we trained for a particular body can also be used to infer cloth with unique wrinkle patterns on different body shapes and T-shirt sizes without retraining (see supplemental material). In order to further illustrate the efficacy of our approach, we show that the KDSM is amenable to being used with recently proposed works on texture sliding for better three-dimensional reconstruction Wu et al. (2020b) as well as in conjunction with networks that use a postprocess for better physical accuracy in the L∞ norm Geng et al. (2020) (see Figure 10). In summary, our specific contributions are: 1) a novel three-dimensional parameterization for virtual cloth adapted from the KDSM, 2) an extension (enabling plastic deformation) of the KDSM to accurately model cloth deformation, and 3) a learning framework to efficiently infer such deformations from body pose. The mean error of the cloth predicted in Jin et al. (2020) is five standard deviations higher than the mean error of our results.
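The core idea of embedding cloth vertices into a deforming tetrahedral mesh can be illustrated with a minimal sketch (function and variable names are ours for illustration, not the paper's implementation): each cloth vertex is assigned fixed barycentric weights with respect to the tetrahedron containing it in the rest pose, and those weights transport the vertex as the tetrahedral mesh deforms.

```python
import numpy as np

def barycentric_weights(p, tet):
    """Solve for weights (w0..w3) with p = sum_i w_i * tet[i] and sum_i w_i = 1.

    tet is a (4, 3) array of tetrahedron vertex positions."""
    # Express p - v0 in the basis of the tetrahedron's edge vectors.
    A = np.column_stack([tet[1] - tet[0], tet[2] - tet[0], tet[3] - tet[0]])
    w123 = np.linalg.solve(A, p - tet[0])
    return np.concatenate([[1.0 - w123.sum()], w123])

# Rest-pose tetrahedron and an embedded cloth vertex.
tet_rest = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
p_rest = np.array([0.25, 0.25, 0.25])
w = barycentric_weights(p_rest, tet_rest)  # fixed at embedding time

# After the volumetric mesh deforms (here: an arbitrary affine motion of the
# tetrahedron), the same weights carry the cloth vertex along with the volume.
shear = np.array([[1., 0.2, 0.], [0., 1., 0.], [0., 0., 1.]])
tet_posed = tet_rest @ shear + 1.0
p_posed = w @ tet_posed
```

Because the weights are computed once in the rest pose, any nonlinear deformation of the tetrahedral mesh (from skinning or simulation) is inherited by the embedded cloth for free.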

2. RELATED WORK

Cloth: Data-driven cloth prediction using deep learning has shown significant promise in recent years. To generate clothing on the human body, a common approach is to reconstruct the cloth and body jointly Alldieck et al. (2018a;b); Xu et al. (2018); Alldieck et al. (2019a;b); Habermann et al. (2019); Natsume et al. (2019); Saito et al. (2019); Yu et al. (2019); Bhatnagar et al. (2019); Onizuka et al. (2020); Saito et al. (2020). In such cases, human body models such as SCAPE Anguelov et al. (2005) and SMPL Loper et al. (2015) can be used to reduce the dimensionality of the output space. To predict cloth shape, a number of works have proposed learning offsets from the body surface Guan et al. (2012); Neophytou & Hilton (2014); Pons-Moll et al. (2017); Lahner et al. (2018); Yang et al. (2018); Gundogdu et al. (2019); Santesteban et al. (2019); Patel et al. (2020); Jin et al. (2020) such that body skinning can be leveraged. There are a variety of skinning techniques used in animation; the most popular approach is linear blend skinning (LBS) Magnenat-Thalmann et al. (1988); Lander (1998). Though LBS is efficient and computationally inexpensive, it suffers from well-known artifacts addressed in Kavan & Žára (2005); Kavan et al. (2007); Jacobson & Sorkine (2011); Le & Hodgins (2016). Since regularization often leads to overly smooth cloth predictions, additional wrinkles/folds can be added to initial network inference results Popa et al. (2009); Mirza & Osindero (2014); Robertini et al. (2014); Lahner et al. (2018); Wu et al. (2020b); Patel et al. (2020). Most recently, Patel et al. (2020) parameterized cloth as a submesh of the SMPL body mesh and decomposed cloth deformation into low-frequency and high-frequency components. However, this parameterization limits cloth to be bound by the topology of SMPL, and the high-frequency folds/wrinkles added by the network are not constrained to match those in the ground truth data. In contrast, our method allows one to predict cloth deformation independent of a predefined PCA basis, and using Geng et al. (2020) ensures that folds/wrinkles are physically consistent.

3D Parameterization: Parameterizing the air surrounding deformable objects is a way of treating collisions during physical simulation Sifakis et al. (2008); Müller et al. (2015); Wu & Yuksel (2016). For hair simulation in particular, previous works have parameterized the volume enclosing the head or body using tetrahedral meshes Lee et al. (2018; 2019) or lattices Volino & Magnenat-Thalmann (2004; 2006). These volumes are animated such that the embedded hairs follow the body as it deforms, enabling efficient hair animation, simulation, and collisions. Interestingly, deforming a low-dimensional reference map that parameterizes high-frequency details has been explored in computational physics as well, particularly for fluid simulation, see e.g. Bellotti & Theillard (2019).
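For context, linear blend skinning computes each skinned vertex as a weighted combination of that vertex transformed rigidly by every joint. A minimal sketch (illustrative names only, not any particular library's API):

```python
import numpy as np

def linear_blend_skinning(rest_vertices, weights, joint_transforms):
    """LBS: v_i' = sum_j w_ij * (R_j @ v_i + t_j).

    rest_vertices: (n, 3); weights: (n, m) with rows summing to one;
    joint_transforms: list of m (R, t) pairs, R (3, 3) rotation, t (3,) translation."""
    posed = np.zeros_like(rest_vertices)
    for j, (R, t) in enumerate(joint_transforms):
        # Each joint's rigid transform is applied to all vertices, then
        # blended per vertex by that joint's skinning weight.
        posed += weights[:, j:j + 1] * (rest_vertices @ R.T + t)
    return posed

# Two joints: the identity, and a 90-degree rotation about z plus a translation.
Rz = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 1.]])
transforms = [(np.eye(3), np.zeros(3)), (Rz, np.array([0., 1., 0.]))]
verts = np.array([[1., 0., 0.], [0., 1., 0.]])
w = np.array([[1.0, 0.0],   # fully bound to joint 0
              [0.5, 0.5]])  # blended evenly between both joints
posed = linear_blend_skinning(verts, w, transforms)
```

The well-known LBS artifacts (e.g. "candy-wrapper" collapse) arise precisely because this linear blend of rigid transforms is not itself a rigid transform.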

3. SKINNING A 3D PARAMETERIZATION

We generate a KDSM using red/green tetrahedralization Molino et al. (2003); Teran et al. (2005a) to parameterize a three-dimensional volume surrounding the body. Starting with the body in the T-pose, we surround it with an enlarged bounding box containing a three-dimensional Cartesian grid. As is typical for collision bodies in computer graphics Bridson et al. (2003), we generate a level set representation separating the inside of the body from the outside (see e.g. Osher & Fedkiw (2002)). See Figure 1a. Next, a thickened level set is computed by subtracting a constant value from the current level set values (Figure 1b). Then, we use red/green tetrahedralization as outlined in Molino
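The thickening step is simply a constant offset of the signed distance field. A minimal sketch on a Cartesian grid (a unit sphere stands in for the body level set; names are ours for illustration):

```python
import numpy as np

# Signed distance to a unit sphere sampled on a Cartesian grid
# (negative inside, positive outside), standing in for the body level set.
n = 32
axis = np.linspace(-2.0, 2.0, n)
x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
phi = np.sqrt(x**2 + y**2 + z**2) - 1.0

# Thickened level set: subtracting a constant moves the zero isocontour
# outward, enclosing a layer of air around the body.
thickness = 0.5
phi_thick = phi - thickness

inside_body = phi < 0.0          # grid points inside the original surface
inside_thick = phi_thick < 0.0   # grid points inside the enlarged region
# Every point inside the body is also inside the thickened region; the
# tetrahedral mesh is then built to cover the thickened region.
```

The zero isocontour of `phi_thick` bounds the volumetric region of air that the red/green tetrahedralization subsequently fills.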




