ARBITRARY VIRTUAL TRY-ON NETWORK: CHARACTERISTICS REPRESENTATION AND TRADE-OFF BETWEEN BODY AND CLOTHING

Abstract

Deep-learning-based virtual try-on systems have made encouraging progress recently, but several major challenges remain, such as trying on arbitrary clothes of all types, trying on clothes across categories, and generating image-realistic results with few artifacts. To address these issues, we propose the Arbitrary Virtual Try-On Network (AVTON), which handles all clothing types and synthesizes realistic try-on images by preserving and trading off the characteristics of the target clothes and the reference person. Our approach consists of three modules: 1) the Limbs Prediction Module (LPM), which predicts the human body parts while preserving the characteristics of the reference person. This is particularly effective for cross-category try-on tasks (e.g., long sleeves ↔ short sleeves or long pants ↔ skirts), where the exposed arms or legs, with their skin colors and details, can be reasonably predicted; 2) the Improved Geometric Matching Module, which warps the clothes according to the geometry of the target person. We improve the TPS-based warping method with a compactly supported radial basis function (Wendland's Ψ-function); 3) the Trade-Off Fusion Module, which trades off the characteristics of the warped clothes and the reference person, making the generated try-on images look more natural and realistic through a fine-tuned symmetric network structure. Extensive experiments show that our approach achieves better performance than state-of-the-art virtual try-on methods.
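To make the second module's key ingredient concrete: Wendland's Ψ-functions are compactly supported, positive-definite radial basis functions commonly used in scattered-data interpolation. The sketch below is our own illustration (not the paper's implementation) of how the C² Wendland kernel ψ(r) = (1−r)⁴₊(4r+1) can stand in for the globally supported TPS kernel r² log r when solving for a 2-D warp from control-point correspondences; all function and variable names are hypothetical.

```python
import numpy as np

def wendland_c2(r):
    """Wendland's C^2 compactly supported RBF: (1 - r)^4_+ (4r + 1)."""
    return np.clip(1.0 - r, 0.0, None) ** 4 * (4.0 * r + 1.0)

def fit_rbf_warp(src, dst, support=1.0):
    """Solve for RBF + affine weights mapping source control points to targets.

    src, dst: (N, 2) arrays of corresponding control-point coordinates.
    support:  radius beyond which the Wendland kernel is exactly zero,
              which keeps the kernel matrix sparse for many control points.
    Returns a function that warps arbitrary (M, 2) point arrays.
    """
    n = src.shape[0]
    # Pairwise distances between control points, scaled by the support radius.
    d = np.linalg.norm(src[:, None, :] - src[None, :, :], axis=-1) / support
    K = wendland_c2(d)                      # (N, N) kernel matrix
    # Affine part [1, x, y] reproduces translations/linear motions exactly.
    P = np.hstack([np.ones((n, 1)), src])   # (N, 3)
    A = np.block([[K, P], [P.T, np.zeros((3, 3))]])
    rhs = np.vstack([dst, np.zeros((3, 2))])
    coef = np.linalg.solve(A, rhs)          # (N+3, 2) stacked weights
    w, a = coef[:n], coef[n:]

    def warp(pts):
        dd = np.linalg.norm(pts[:, None, :] - src[None, :, :], axis=-1) / support
        return wendland_c2(dd) @ w + np.hstack([np.ones((len(pts), 1)), pts]) @ a

    return warp
```

Because the kernel vanishes outside the support radius, each warped pixel is influenced only by nearby control points, unlike the global TPS kernel where every control point affects every pixel.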

1. INTRODUCTION

Figure 1: We propose AVTON, which is trained on an all-type clothing dataset. It can be adapted to all-type clothing try-on tasks and produces image-realistic results. The clothing types are divided into top, bottom, and whole, and we apply CP-VTON Wang et al. (2018a) and AVTON to them. CP-VTON is retrained on the all-type clothing dataset; AVTON (Vanilla) denotes AVTON trained without the LPM and Wendland's Ψ-function Wendland (1995), while AVTON (Full) is trained with both.

* Corresponding authors. This work is supported by the National Natural Science Foundation of China (61971121, 61672365, 62106211). Our code and dataset are available at https://github.com/LiuYuZzz/AVTON.

During the past few years, computer vision technology has been widely applied in fashion-related tasks. These applications include clothes detection Liu et al. (2016); Ge et al. (2019), clothes parsing Li et al. (2019); Gong et al. (2017), and clothes attribute and category recognition. Although these works have made some progress, most of them focus only on top-clothes try-on and cannot handle arbitrary try-on tasks (top, bottom, or whole-body clothes). Arbitrary try-on is of great practical value in real-world applications, yet few works have addressed it. In addition, several challenges and limitations remain: 1) most benchmark datasets used for training virtual try-on methods mainly contain top clothes. As a result, the trained models can only handle the top-clothing try-on task and cannot be adapted to bottom- or whole-clothing try-on; 2) cross-category try-on (e.g., long sleeves ↔ short sleeves or long pants ↔ skirts) is another challenge. A case in point is that when a person tries on short sleeves in place of long sleeves, parts of the arms become exposed. It is therefore necessary to preserve the characteristics of the reference person and predict the exposed body parts when generating image-realistic try-on results. However, most current methods Han et al. (2018); Wang et al. (2018a); Pandey & Savakis (2020); Yu et al. (2019) do not consider this issue. As a result, poor try-on results appear, e.g., the person's limbs are covered by clothes,

