UNSUPERVISED 3D OBJECT LEARNING THROUGH NEURON ACTIVITY AWARE PLASTICITY

Abstract

We present an unsupervised deep learning model for 3D object classification. Conventional Hebbian learning, a well-known unsupervised model, suffers from loss of local features, leading to reduced performance on tasks with complex geometric objects. We present a deep network with a novel Neuron Activity Aware (NeAW) Hebbian learning rule that dynamically switches each neuron between Hebbian and anti-Hebbian learning depending on its activity. We analytically show that NeAW Hebbian learning relieves the bias in neuron activity, allowing more neurons to attend to the representation of the 3D objects. Empirical results show that NeAW Hebbian learning outperforms other variants of Hebbian learning and achieves higher accuracy than fully supervised models when training data is limited.

1. INTRODUCTION

Supervised deep networks for recognizing objects from 3D point clouds have demonstrated high accuracy but generally suffer from poor performance when labeled training data is limited (Wu et al., 2015; Qi et al., 2017a;b; Wang et al., 2019; Maturana & Scherer, 2015). On the other hand, self-supervised or unsupervised models can be trained without labeled data, thereby improving performance in data-efficient scenarios. Self-supervised learning methods have been studied for 3D object recognition mostly in an autoencoder setting, which reconstructs the input to learn the representation (Achlioptas et al., 2018; Girdhar et al., 2016). Unsupervised learning has also been applied to pre-process the input for an encoder, while still largely relying on supervised learning (Li et al., 2018). Conventionally, self-organizing maps and growing neural gas have been used as fully unsupervised models for 3D objects, though they aim to reconstruct the surface of the objects (do Rêgo et al., 2007; Mole & Araújo, 2010). A fully unsupervised deep network for 3D object classification has rarely been studied. Unsupervised Hebbian learning is known to offer attractive advantages such as data efficiency, noise robustness, and adaptability for various applications (Najarro & Risi, 2020; Kang et al., 2022; Miconi et al., 2018; Zhou et al., 2022). Basic Hebbian and anti-Hebbian learning refer to the strengthening and weakening, respectively, of a synaptic weight when pre- and post-synaptic neurons are simultaneously activated (Hebb, 2005). Many past efforts have developed variants of Hebb's rule. Examples include Oja's rule and Grossberg's rule (Oja, 1982; Grossberg, 1976) for object recognition (Amato et al., 2019; Miconi, 2021), the ABCD rule (Soltoggio et al., 2007) for meta-learning and reinforcement tasks (Najarro & Risi, 2020), and another variant for hetero-associative memory (Limbacher & Legenstein, 2020).
However, Hebbian learning is often vulnerable to the loss of local features (Miconi, 2021; Bahroun et al., 2017; Bahroun & Soltoggio, 2017; Amato et al., 2019). This is a major challenge for applying Hebbian rules to tasks with more complex geometric objects, such as object recognition from 3D point clouds. In this paper, we present an unsupervised deep learning model for 3D object recognition that uses a novel neuron-activity-aware, plasticity-based Hebbian learning rule to mitigate the vanishing of local features, thereby improving the performance of 3D object classification. We observe that, in networks trained with plain Hebbian learning, only a few neurons always activate irrespective of the object class. In other words, spatial features of 3D objects are represented by the activation of only a few specific neurons, which degrades task performance. We develop a hybrid Hebbian learning rule, referred to as Neuron Activity Aware (NeAW) Hebbian learning, that relieves the biased activity. The key concept of NeAW Hebbian learning is to dynamically convert the learning rule of synapses associated with an output neuron from Hebbian to anti-Hebbian, or vice versa, depending on the activity of the output neuron. The reduction of bias allows a different subset of neurons to activate for different object classes, which increases class-to-class dissimilarity in the latent space. Our deep learning model uses a feature extraction module trained by NeAW Hebbian learning and a classifier module trained by supervised learning. The feature extractor, designed as a multi-layer perceptron (MLP), transforms the positional vector of sampled points on 3D objects into a high-dimensional space (Qi et al., 2017a;b). The experimental results evaluated on ModelNet10 and ModelNet40 (Wu et al., 2015) show that the proposed NeAW Hebbian learning outperforms prior Hebbian rules for efficient unsupervised 3D deep learning tasks.
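The activity-dependent switching described above can be illustrated with a minimal sketch. This is not the paper's implementation: the running-activity estimate, the threshold `theta`, and the function names are all hypothetical, chosen only to make the Hebbian/anti-Hebbian switch concrete.

```python
import numpy as np

def neaw_hebbian_update(w, x, y, activity, eta=0.01, theta=0.5):
    """Illustrative NeAW-style update for one fully connected layer.

    w:        (n_out, n_in) weight matrix
    x:        (n_in,) input vector
    y:        (n_out,) output activations
    activity: (n_out,) running estimate of each output neuron's activity
    theta:    hypothetical activity threshold that decides the sign
    """
    # Neurons whose running activity exceeds the threshold switch to
    # anti-Hebbian updates (sign -1); the rest remain Hebbian (+1).
    sign = np.where(activity > theta, -1.0, 1.0)
    dw = eta * sign[:, None] * np.outer(y, x)
    return w + dw
```

Under this sketch, a chronically over-active neuron has its incoming weights weakened, freeing other neurons to respond to the input and reducing the activity bias.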
This paper makes the following key contributions:
• We present a deep learning model for 3D object recognition with the NeAW Hebbian learning rule that dynamically controls Hebbian and anti-Hebbian learning to relax the biased activity of neurons. The NeAW Hebbian learning efficiently transforms spatial features of various classes of 3D objects into a high-dimensional space defined by neuron activities.
• We analytically prove that the NeAW Hebbian learning relieves the biased activity of output neurons if the input is under a given geometric condition, while solely applying Hebbian or anti-Hebbian learning does not guarantee the relaxation of the skewed activity.
• We analytically prove that purely Hebbian learning and anti-Hebbian learning on the biased neuron activity leads to a poor subspace representation with few principal components, thereby limiting the performance in the classification tasks.
• We empirically demonstrate that the NeAW Hebbian learning rule outperforms the existing variants of Hebbian learning rules in the 3D object recognition task. We also show that NeAW Hebbian learning achieves higher accuracy than end-to-end supervised learning when training data is limited (data-efficient learning).

2. RELATED WORK AND BACKGROUND

Deep Learning Models for 3D Object Recognition. Supervised 3D convolutional neural network (CNN) models have been developed to process a volumetric representation of 3D objects (Maturana & Scherer, 2015; Wu et al., 2015), but high sparsity in volumetric input increases computation cost, limiting applications to low-resolution point clouds. Multi-view CNNs render 3D objects into images at different views and process these images using 2D CNNs (Su et al., 2015). VGG- and ResNet-based models show good performance when the training images are well-engineered under proper multi-views (Su et al., 2018). However, 2D CNN-based approaches are difficult to scale to complex 3D tasks (Qi et al., 2017a). Recently, low-complexity point-based models have been designed to process a point cloud where the input is the (x, y, z) coordinates of points (Qi et al., 2017a). Self-supervised learning models have also been proposed based on autoencoders and generative adversarial networks (Wu et al., 2016; Sharma et al., 2016). Both models accept a voxel representation of 3D objects and learn the latent representation of the objects by reconstructing the voxels. The learned representation is used as the input to an additional classifier. Our unsupervised learning does not use labels, but unlike existing self-supervised models, our approach does not reconstruct the objects.

Hebbian Learning Models. The variants of the Hebbian learning rule are given as:

w(t + 1) =
\begin{cases}
w(t) + \eta y x & \text{Hebb's rule} \\
w(t) + \eta y \left(x - y w(t)\right) & \text{Oja's rule} \\
w(t) + \eta y \left(x - w(t)\right) & \text{Grossberg's rule}
\end{cases}
\qquad (1)

where x, y, and w are the input, output, and weight, and η is the learning rate. Hebb's rule is the basic form of Hebbian learning, where weights are updated if both the input and output neurons fire (Hebb, 2005). This linear association can be interpreted as biologically plausible principal component analysis (PCA) if the data samples are assumed to be zero-mean (Weingessel & Hornik, 2000).
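The three update rules in Eq. (1) can be written directly as code. The sketch below implements them for a single scalar synapse; the function names are ours, and the learning rate η defaults to an arbitrary illustrative value.

```python
def hebb(w, x, y, eta=0.01):
    # Hebb's rule: strengthen w whenever input x and output y co-fire
    return w + eta * y * x

def oja(w, x, y, eta=0.01):
    # Oja's rule: Hebbian growth with a decay term -y^2 * w that
    # normalizes the weight and prevents unbounded growth
    return w + eta * y * (x - y * w)

def grossberg(w, x, y, eta=0.01):
    # Grossberg's rule: pull w toward the input x, gated by the output y
    return w + eta * y * (x - w)
```

Note how only Oja's and Grossberg's rules contain a subtractive term involving w(t); plain Hebb's rule has no such decay, which is the source of the weight divergence discussed next.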
However, the plain Hebb's rule is often vulnerable to the divergence of weight vectors as

