LATERAL INHIBITION-INSPIRED STRUCTURE FOR CONVOLUTIONAL NEURAL NETWORK ON IMAGE CLASSIFICATION Anonymous

Abstract

Convolutional neural networks (CNNs) have become powerful and popular tools since deep learning emerged for image classification in computer vision. For better recognition, both the depth and width dimensions have been explored, leading to convolutional neural networks with more layers and channels. Beyond these factors, neurobiology suggests lateral inhibition (lateral antagonism, e.g. the Mach band effect), a widely observed phenomenon in vision that increases the contrast and sharpness of nearby neuron excitation in the lateral direction to aid recognition. However, this mechanism has not been well explored in the design of convolutional neural networks. In this paper, we explicitly explore the filter dimension in the lateral direction and propose our lateral inhibition-inspired (LI) structure. Our naive design uses a low-pass filter to mimic the decay of lateral interaction strength from neighbors with distance. One learnable parameter per channel sets the amplitude of the low-pass filter by multiplication, which is flexible enough to model various lateral interactions (including lateral inhibition). The convolution result is then subtracted from the input, which can increase contrast and sharpness for better recognition. Furthermore, a learnable scaling factor and shift adjust the value after subtraction. Our lateral inhibition-inspired (LI) structure works on both plain convolutions and convolutional blocks with residual connections, while remaining compatible with existing modules. Preliminary results demonstrate clear improvements on the ImageNet dataset for AlexNet (7.58%) and ResNet-18 (0.81%), respectively, with little increase in parameters, indicating the effectiveness of our brain-similar design in helping feature learning for image classification from a different perspective.

1. INTRODUCTION

In recent years, convolutional neural networks (CNNs) (Hinton et al., 2012; Simonyan & Zisserman, 2015; Szegedy et al., 2015; He et al., 2016) have become powerful and popular tools since deep learning emerged for image classification in computer vision. They have achieved record-breaking performance and outperformed traditional methods (Quinlan, 1986; Cortes & Vapnik, 1995) built on hand-crafted features (Lowe, 1999; Dalal & Triggs, 2005) on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) (Deng et al., 2009). Today, convolutional neural networks still possess unique merits: they are the most extensively studied architecture, and convolution has a strong connection with the human visual system and image processing, making them good models for feature learning research. Different factors have been explored to improve the recognition performance of convolutional neural networks. VGGNet (Simonyan & Zisserman, 2015) applies a small convolution kernel size (3 × 3) to increase network depth, while ResNet (He et al., 2016) introduces deep residual learning to make training very deep networks feasible. The success of such networks indicates that depth is a crucial factor for recognition performance. Wide Residual Networks (Zagoruyko & Komodakis, 2016), on the other hand, demonstrate that width is another important factor for improved performance. In addition to these factors, neurobiology suggests that the widely observed lateral inhibition (lateral antagonism, e.g. the Mach band effect, shown in Fig. 1), a phenomenon that increases the contrast and sharpness of nearby neuron excitation in the lateral direction, is also important for feature learning. Hasani et al. (2019) propose a surround modulation design, using a manually defined kernel from the difference of Gaussians (DoG) function, to explore the feasibility of incorporating the lateral inhibition mechanism.
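For concreteness, a center-surround kernel of the kind used in that fixed design can be built from the difference of two Gaussians. The sketch below is illustrative only; the kernel size and sigma values are our assumptions, not the settings of that work:

```python
import numpy as np

def dog_kernel(size=7, sigma_c=1.0, sigma_s=2.0):
    """Difference-of-Gaussians (DoG) kernel: a narrow excitatory center
    Gaussian minus a wider inhibitory surround Gaussian."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx ** 2 + yy ** 2
    center = np.exp(-r2 / (2 * sigma_c ** 2)) / (2 * np.pi * sigma_c ** 2)
    surround = np.exp(-r2 / (2 * sigma_s ** 2)) / (2 * np.pi * sigma_s ** 2)
    return center - surround  # positive at the center, negative in the surround
```

The resulting kernel excites the central position and suppresses its neighborhood, which is the classic center-surround profile that lateral inhibition produces in early vision.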
This fixed design is applied only to half of the channels of the initial layer (with better performance than applying it to all channels). It is still worth exploring a flexible lateral inhibition-inspired design for convolutional neural networks, to incorporate such a mechanism during training and make the network more brain-similar for image classification. In this paper, inspired by findings in neurobiology, we explicitly explore the filter dimension in the lateral direction and propose our lateral inhibition-inspired (LI) structure, which incorporates the lateral inhibition mechanism in a flexible manner inside modern convolutional neural networks. Our simple naive design applies a low-pass filter with the central weight eliminated, to mimic the decay of inhibition strength from neighbors with distance. To set the amplitude of the low-pass filter by multiplication, a learnable parameter per channel is used, giving the flexibility to model various lateral interactions (including lateral inhibition). The convolution result is then subtracted from the input, which can increase contrast and sharpness for better recognition. Furthermore, a learnable scaling factor and shift adjust the value after subtraction. Our lateral inhibition-inspired (LI) method works on both plain convolutions and convolutional blocks with residual connections, while remaining compatible with existing modules. Preliminary results of our simple naive design demonstrate clear improvements on the ImageNet dataset for AlexNet (7.58%) and ResNet-18 (0.81%), respectively, with little parameter increase, indicating the effectiveness of our brain-similar design in helping feature learning for image classification from a different perspective. Our main contributions can be summarized as:

• We propose a lateral inhibition-inspired structure for modern convolutional neural network design, which can participate in the training stage for image classification.
• Our design is flexible compared to the fixed difference of Gaussians (DoG) filter, and can be applied to all layers (and channels) with competitive performance.

• We are among the few to explicitly model lateral inhibition, as well as other lateral interactions, to make the network more brain-similar, given the flexible weight design.
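The structure described above can be sketched as a PyTorch module. This is a minimal sketch under our reading of the text, not the authors' released implementation; the kernel size, the uniform (averaging) choice of low-pass filter, and the initialization values are assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LateralInhibition(nn.Module):
    """Sketch of the lateral inhibition-inspired (LI) structure: a fixed
    low-pass kernel with zeroed center models the distance decay of lateral
    interaction; a learnable per-channel amplitude scales it (its sign can
    model inhibition or facilitation); the filtered response is subtracted
    from the input; a learnable scale and shift adjust the result."""

    def __init__(self, channels, kernel_size=3):
        super().__init__()
        k = torch.ones(kernel_size, kernel_size)
        k[kernel_size // 2, kernel_size // 2] = 0.0   # eliminate the central weight
        k = k / k.sum()                               # normalized low-pass over neighbors
        # One fixed (untrained) copy of the kernel per channel, applied depthwise.
        self.register_buffer("kernel",
                             k.expand(channels, 1, kernel_size, kernel_size).clone())
        self.pad = kernel_size // 2
        self.amplitude = nn.Parameter(torch.zeros(channels))  # per-channel amplitude
        self.scale = nn.Parameter(torch.ones(channels))       # learnable scaling factor
        self.shift = nn.Parameter(torch.zeros(channels))      # learnable shift

    def forward(self, x):
        # Depthwise convolution: each channel smoothed by its own kernel copy.
        low = F.conv2d(x, self.kernel, padding=self.pad, groups=x.size(1))
        a = self.amplitude.view(1, -1, 1, 1)
        out = x - a * low                             # subtract the lateral interaction
        return out * self.scale.view(1, -1, 1, 1) + self.shift.view(1, -1, 1, 1)
```

With amplitude initialized to zero and scale/shift to one/zero, the module starts as an identity mapping, so it can be dropped into an existing network (after a plain convolution or inside a residual block) without disturbing pretrained behavior; training then learns the per-channel interaction strength.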



Figure 1: Illustration of the Mach band effect. Due to lateral inhibition, the actual perception differs from the real input, producing increased contrast and sharpness along the boundary (note the perceived and actual curves bending in opposite directions), where the dark area appears darker and the bright area appears brighter.

