DEEAPR: CONTROLLABLE DEPTH ENHANCEMENT VIA ADAPTIVE PARAMETRIC FEATURE ROTATION

Abstract

Understanding the depth of an image provides viewers with a better interpretation of its 3D structure. Photographers exploit numerous factors that affect depth perception to aesthetically improve a scene. Unfortunately, controlling depth perception after an image has been captured is difficult, as it requires accurate and explicit depth information. Moreover, defining a quantitative metric for a subjective quality such as depth perception is difficult, which makes supervised learning a great challenge. To this end, we propose DEpth Enhancement via Adaptive Parametric feature Rotation (DEEAPR), which modulates the perceptual depth of an input scene using a single control parameter, without the need for explicit depth information. We first embed the content-independent depth perception of a scene by visual representation learning. We then train a controllable depth enhancer network with a novel modulator, the parametric feature rotation block (PFRB), that allows continuous modulation of a representative feature. We demonstrate the effectiveness of the proposed approach by verifying each component through an ablation study and by comparison to other controllable methods.



Introduction

Enhancing depth perception in 2D images presents more realistic content to viewers, as it allows for a better interpretation of the 3D scene structure. The human visual system uses a variety of cues to infer depth, such as the disparity that arises from binocular vision or motion (Kim et al., 2016). Depth cues available in single static images, such as occlusion, shading, or blur, are referred to as pictorial depth cues (O'Shea et al., 1997). Many artists and photographers exploit pictorial depth cues by adding synthetic effects to increase the impression of depth in still images. For example, amplifying defocus blur triggers the depth-of-focus cue, whereby objects within the range of focus appear sharp and those farther away appear blurry.

However, enhancing the depth perception of an image by manipulating depth cues without explicit depth information of the scene is a challenging task. Moreover, depth perception is a subjective quality that varies from image to image, which is difficult to learn in a supervised manner. Our final goal is to modulate the perceptual depth of an input scene using a single control parameter, without the need for explicit depth information. While these traits of depth perception make supervised training very difficult, we rely on the recent success of unsupervised visual representation learning, which considers a high-dimensional space in which the degree of similarity is inversely correlated with the distance between instances. As the first step, we embed the content-independent depth perception of a scene onto this representation space.
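To give an intuition for modulating a feature with a single control parameter via rotation, the sketch below rotates a feature vector toward a learned target direction by an angle proportional to a scalar control value. This is only an illustrative interpretation of parametric feature rotation; the function name, the fixed maximum angle `theta_max`, and the choice of rotating in the plane spanned by the two vectors are our assumptions, not the paper's actual PFRB design.

```python
import numpy as np

def parametric_feature_rotation(f, v, alpha, theta_max=np.pi / 2):
    """Rotate feature f toward direction v by an angle alpha * theta_max.

    f: (d,) feature vector; v: (d,) target direction; alpha in [0, 1]
    is the single control parameter. The rotation is performed in the
    2D plane spanned by f and v, so the feature norm is preserved and
    alpha interpolates continuously between f (alpha=0) and v's
    direction (alpha=1, with theta_max = pi/2).
    """
    f_norm = np.linalg.norm(f)
    u1 = f / f_norm                        # first basis vector of the plane
    v_perp = v - (v @ u1) * u1             # component of v orthogonal to f
    u2 = v_perp / np.linalg.norm(v_perp)   # second basis vector
    theta = alpha * theta_max              # angle set by the control parameter
    # f lies entirely along u1, so rotating within the (u1, u2) plane
    # reduces to a 2D rotation of its coordinates.
    return f_norm * (np.cos(theta) * u1 + np.sin(theta) * u2)
```

Because the norm of the feature is unchanged, only the "direction" of the representation is modulated, which matches the idea of continuously steering a representative feature with one scalar.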



Figure 1: Illustration of DEEAPR framework.

