RECOVERING GEOMETRIC INFORMATION WITH LEARNED TEXTURE PERTURBATIONS

Abstract

Regularization is used to avoid overfitting when training a neural network; unfortunately, this reduces the attainable level of detail, hindering the ability to capture high-frequency information present in the training data. Even though various approaches may be used to re-introduce high-frequency detail, it typically does not match the training data and is often not time coherent. In the case of network-inferred cloth, these shortcomings manifest themselves as either a lack of detailed wrinkles or unnaturally appearing and/or time-incoherent surrogate wrinkles. Thus, we propose a general strategy whereby high-frequency information is procedurally embedded into low-frequency data so that when the latter is smeared out by the network, the former still retains its high-frequency detail. We illustrate this approach by learning texture coordinates which, when smeared, do not in turn smear out the high-frequency detail in the texture itself but merely smoothly distort it. Notably, we prescribe perturbed texture coordinates that are subsequently used to correct the over-smoothed appearance of inferred cloth, and correcting the appearance from multiple camera views naturally recovers lost geometric information.



Introduction

Since neural networks are trained to generalize to unseen data, regularization is important for reducing overfitting; see e.g. Goodfellow et al. (2016); Scholkopf & Smola (2001). However, regularization also removes some of the high variance characteristic of much of the physical world. Even though high-quality ground truth data can be collected or generated to reflect the desired complexity of the outputs, regularization will inevitably smooth network predictions. Rather than attempting to directly infer high-frequency features, we alternatively propose to learn a low-frequency space in which such features can be embedded. We focus on the specific task of adding high-frequency wrinkles to virtual clothing, noting that the idea of learning a low-frequency embedding may be generalized to other tasks.
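To make the embedding idea concrete, the following minimal sketch (our own 1D toy in NumPy, not the paper's implementation; the sampler render_1d and the moving-average filter are hypothetical stand-ins) contrasts smoothing a rendered signal with smoothing the texture coordinates used to sample it: the former destroys the high-frequency stripes, while the latter only distorts them smoothly and the detail survives.

```python
import numpy as np

def render_1d(texture, u):
    """Sample a 1D texture at texture coordinates u in [0, 1)."""
    idx = np.clip((u * len(texture)).astype(int), 0, len(texture) - 1)
    return texture[idx]

def smooth(signal, k=15):
    """Moving-average low-pass filter, standing in for the smoothing
    effect of a regularized network."""
    return np.convolve(signal, np.ones(k) / k, mode="same")

# A high-frequency stripe texture and identity texture coordinates.
texture = (np.sin(np.linspace(0.0, 200.0 * np.pi, 2000)) > 0.0).astype(float)
u = np.linspace(0.0, 1.0, 1000, endpoint=False)

# Smoothing the rendered signal wipes out the stripes ...
blurred = smooth(render_1d(texture, u))
# ... whereas smoothing the texture coordinates only distorts them, and
# the stripes (the high-frequency detail) survive.
warped = render_1d(texture, smooth(u))

print("detail (std) after smoothing the image:       %.3f" % blurred.std())
print("detail (std) after smoothing the coordinates: %.3f" % warped.std())
```

Here the moving-average filter merely mimics the smoothing induced by regularization; the point is that detail stored in the texture is robust to smoothing of the coordinates used to look it up.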



Figure 1: Texture coordinate perturbations (texture sliding) reduce shape inference errors: ground truth (blue), prediction (orange).

Because cloth wrinkles/folds are high-frequency features, existing deep neural networks (DNNs) trained to infer cloth shape tend to predict overly smooth meshes Alldieck et al. (2019a); Daněřek et al. (2017); Guan et al. (2012); Gundogdu et al. (2019); Jin et al. (2020); Lahner et al. (2018); Natsume et al. (2019); Santesteban et al. (2019); Wang et al. (2018); Patel et al. (2020). Rather than attempting to amend such errors directly, we perturb texture so that the rendered cloth mesh appears to more closely match the ground truth; see Figure 1. Then, given texture perturbations from at least two unique camera views, 3D geometry can be accurately reconstructed Hartley & Sturm (1997) to recover high-frequency wrinkles. Similarly, for AR/VR applications, correcting visual appearance from two views (one for each eye) is enough to allow the viewer to accurately discern 3D geometry. Our proposed texture coordinate perturbations are highly
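As a concrete illustration of the two-view reconstruction step referenced above, the sketch below (again our own toy example with made-up camera parameters, not the authors' pipeline) triangulates a single corresponded point from two views with the standard linear (DLT) method; Hartley & Sturm (1997) analyze this and more accurate variants. In the setting of this paper, the correspondences would come from the perturbed texture coordinates observed in each view; here they are simply synthesized by projecting a known 3D point.

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of a single 3D point from two views.
    P1, P2: 3x4 camera projection matrices; x1, x2: corresponding pixels."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)          # null space of A gives the homogeneous point
    X = Vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Pinhole projection of a 3D point X to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Made-up intrinsics and a small horizontal baseline between the two cameras.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.2], [0.0], [0.0]])])

# A point on a cloth wrinkle, observed via its texture feature in both views.
X_true = np.array([0.05, -0.02, 1.5])
x1, x2 = project(P1, X_true), project(P2, X_true)

print(triangulate_dlt(P1, P2, x1, x2))   # ~ [0.05, -0.02, 1.5]
```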

