LMSER-PIX2SEQ: LEARNING STABLE SKETCH REPRESENTATIONS FOR SKETCH HEALING

Abstract

Sketch healing aims to recreate a complete sketch from a corrupted one. The sparse and abstract nature of sketches makes this task challenging: the features extracted from a corrupted sketch may be inconsistent with those extracted from the corresponding full sketch. In this paper, we present Lmser-pix2seq, which learns sketch representations that are stable against missing information by employing a Least mean square error reconstruction (Lmser) block following the encoder-decoder paradigm. Taking a corrupted sketch as input, the Lmser encoder computes embeddings of the structural patterns of the input, while the decoder reconstructs the complete sketch from these embeddings. We build bi-directional skip connections between the encoder and the decoder in our Lmser block. The feedback connections form recurrent paths that carry information about the reconstruction produced by the decoder back to the encoder, helping the encoder extract stable sketch features. The features captured by the Lmser block are then fed into a recurrent neural network decoder to recreate the sketch. Experimental results show that our Lmser-pix2seq outperforms state-of-the-art methods in sketch healing, especially when sketches are heavily masked or corrupted.

1. INTRODUCTION

Humans are able to complete missing things through imagination, for example by filling in blanks, writing sequels, and repairing images. Sketch healing (Su et al., 2020) is a related task: it aims to synthesise a complete sketch that best resembles a partial input (Su et al., 2020; Qi et al., 2022). Unlike image inpainting (Pathak et al., 2016), where photos carry rich texture information, freehand sketches are highly abstract and sparse, which makes sketch healing quite challenging. The corruption procedure proposed by Su et al. (2020) crops several local visual patches from a raster sketch image and drops some of them, yielding a corrupted raster image together with the remaining visual patches. Conventional sketch generation models (Chen et al., 2017; Zang et al., 2021) that take images as input can be applied to sketch healing. However, these models, designed for general sketch synthesis, do not match SketchHealer-1.0 (Su et al., 2020), which was specifically designed for sketch healing. SketchHealer-1.0 constructs a graphical representation of the sketch by treating patches as nodes and connecting edges based on the nodes' temporal proximity, i.e., the drawing order. This graphic sketch representation allows information to flow between different patches of the same sketch, leading to better healing. Building on SketchHealer-1.0, SketchHealer-2.0 (Qi et al., 2022) considers the relationship between local reconstruction and global semantic preservation; it requires a pre-trained model to calculate the semantic similarity between the recreated sketch and the full sketch. However, both SketchHealer-1.0 (Su et al., 2020) and SketchHealer-2.0 (Qi et al., 2022) build graphs that depend on the drawing order, and this information is not always available.
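The crop-and-drop corruption scheme described above can be illustrated with a minimal NumPy sketch. The function name, the non-overlapping grid layout, the patch size, and the zero-filling of dropped regions are illustrative assumptions for exposition, not the exact procedure of Su et al. (2020):

```python
import numpy as np

def corrupt_sketch(image, patch_size=32, drop_prob=0.3, seed=0):
    """Crop a raster sketch into a grid of non-overlapping patches and
    randomly drop a fraction of them (assumed scheme, for illustration).

    Returns the corrupted image (dropped patches zeroed out) and the
    surviving patches with their top-left grid coordinates.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    corrupted = image.copy()
    kept_patches = []
    for i in range(0, h - patch_size + 1, patch_size):
        for j in range(0, w - patch_size + 1, patch_size):
            if rng.random() < drop_prob:
                # Drop this patch: blank out the corresponding region.
                corrupted[i:i + patch_size, j:j + patch_size] = 0
            else:
                kept_patches.append(
                    ((i, j), image[i:i + patch_size, j:j + patch_size]))
    return corrupted, kept_patches
```

The surviving patches and their coordinates are exactly the ingredients a graph-based healer needs: each kept patch becomes a node, and edges can then be drawn between nodes, e.g. by drawing order when it is available.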
To overcome this difficulty, SketchLattice (Qi et al., 2021) proposes a novel lattice representation and takes an image as input. However, during data processing, the lattice representation discards some of the information in the raster sketch image, which limits SketchLattice's performance. Different from these state-of-the-art graph-structured models, which pass information between nodes to fill in the gaps, we expect the network to take full advantage of the information in the raster sketch images and to learn stable sketch representations in the absence of temporal information. Stable representations mean that the features the model extracts from the corrupted sketch are consistent with those extracted from the corresponding full sketch.
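The feedback mechanism behind such stable representations can be sketched in miniature. The following NumPy toy unrolls an encoder-decoder with symmetric (weight-tied) bi-directional connections for a few recurrent steps, in the spirit of an Lmser block; the two-layer depth, layer sizes, and class name are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class LmserBlock:
    """Toy Lmser-style block: an encoder-decoder pair whose decoder
    shares the transposed encoder weights (symmetric connections),
    unrolled for a fixed number of recurrent feedback steps.
    """
    def __init__(self, in_dim=64, hid_dim=32, code_dim=16, steps=3):
        self.steps = steps
        # Encoder weights; the decoder reuses their transposes.
        self.W1 = rng.standard_normal((in_dim, hid_dim)) * 0.1
        self.W2 = rng.standard_normal((hid_dim, code_dim)) * 0.1

    def forward(self, x):
        feedback = np.zeros_like(x)
        for _ in range(self.steps):
            # Feedforward pass: corrupted input plus decoder feedback.
            h = relu((x + feedback) @ self.W1)
            z = relu(h @ self.W2)
            # Feedback pass through the tied (transposed) weights: the
            # reconstruction is routed back to the encoder input, so the
            # next pass can refine features for the missing regions.
            h_back = relu(z @ self.W2.T)
            feedback = h_back @ self.W1.T
        return z, feedback  # final embedding and reconstruction
```

Each unrolled step lets the encoder re-read the decoder's current reconstruction, which is the intuition behind extracting features from a corrupted sketch that stay consistent with those of the full sketch.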

