NEURON ACTIVATION ANALYSIS IN MULTI-JOINT ROBOT REINFORCEMENT LEARNING

Abstract

Recent experiments indicate that pre-training end-to-end Reinforcement Learning neural networks on general tasks can speed up the training process for specific robotic applications. However, it remains open whether these networks form general feature extractors and a hierarchical organization that can be reused, as is apparent, e.g., in Convolutional Neural Networks. In this paper we analyze the intrinsic neuron activation in networks trained for target reaching with robot manipulators of increasing joint number in a vertical plane. We analyze the activity distribution of individual neurons in the network, introduce a pruning algorithm that reduces network size while preserving performance, and, with these dense network representations, identify correlations of neuron activity patterns among networks trained for robot manipulators with different joint numbers. We show that the input and output network layers exhibit more distinct neuron activations than the inner layers. Our pruning algorithm reduces the network size significantly and increases the distance between neuron activations, while maintaining high performance in training and evaluation. Our results demonstrate that neuron activity can be mapped among networks trained for robots of different complexity. Robots with a small difference in joint number show higher layer-wise projection accuracy, whereas more dissimilar robots mostly show projections to the first layer.

1. INTRODUCTION

Convolutional Neural Networks (CNNs) are well known for their strong general feature extraction capability in the lower network layers. In these networks, feature kernels can not only be visualized; pre-trained general feature extractors can also be reused for efficient network learning. Recent work demonstrates such reusability experimentally for Reinforcement Learning neural networks as well: networks are pre-trained on similar tasks and then continue learning on the goal application. Reusing (sub)networks that can be re-assembled for a previously unseen application can reduce network training time drastically. A better understanding of uniform or inhomogeneous network structures also improves the evaluation of network performance and unveils opportunities for the interpretability of networks, which is crucial for the application of machine learning algorithms, e.g., in industrial scenarios. Finally, methodologies and metrics that estimate intra- and inter-network correlations in artificial neural networks may also enhance the understanding of biological learning. Eickenberg et al. (2017) recently demonstrated that layers serving as feature extractors in CNNs can actually be found in the human visual cortex by correlating artificial networks with biological recordings.

Successful experiments in re-using end-to-end learned networks for similar tasks leave open whether such networks also self-organize feature extractors or, in a dynamical domain, motion primitives. Here, we analyze neuron activation in networks in order to investigate the activation distribution and the mapping between different networks trained on similar robot reaching tasks. In this paper we consider a standard robot manipulator with a variable number of revolute joints, operating in a vertical plane, as the test setup for target-reaching end-to-end Reinforcement Learning (RL) experiments.
We introduce metrics to evaluate individual neuron activation over time and to compare activity within individual networks both all-to-all (every neuron is correlated with every other neuron in the network) and layer-wise (only correlations between neurons in the same layer are inspected). These metrics are used to set up a pruning procedure that maximizes the information density in learned neural networks and reduces redundancy as well as unused network nodes. Exploiting these optimization
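The two comparison modes described above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: it assumes neuron activations are recorded as a (timesteps × neurons) array, uses plain Pearson correlation as the activity metric, and the function name and layer bookkeeping are hypothetical.

```python
import numpy as np

def activation_correlations(activations, layer_sizes):
    """Pairwise Pearson correlations of neuron activation time series.

    activations: array of shape (T, N), the recorded activations of
        N neurons over T timesteps (assumed recording format).
    layer_sizes: neuron count per layer; must sum to N.
    Returns the all-to-all (N x N) correlation matrix and a list of
    per-layer sub-matrices for the layer-wise comparison.
    """
    # All-to-all: correlate every neuron with every other neuron.
    # np.corrcoef treats rows as variables, so transpose to (N, T).
    all_to_all = np.corrcoef(activations.T)

    # Layer-wise: restrict to correlations between neurons that
    # belong to the same layer (diagonal blocks of the full matrix).
    layer_wise = []
    start = 0
    for size in layer_sizes:
        layer_wise.append(all_to_all[start:start + size, start:start + size])
        start += size
    return all_to_all, layer_wise
```

Highly correlated neuron pairs in these matrices indicate redundancy, which is the kind of signal a pruning procedure as described above could exploit to remove nodes without losing information.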

