MULTI-TASK STRUCTURAL LEARNING USING LOCAL TASK SIMILARITY INDUCED NEURON CREATION AND REMOVAL

Abstract

Multi-task learning has the potential to improve generalization by maximizing positive transfer between tasks while reducing task interference. Fully achieving this potential is hindered by manually designed architectures that remain static throughout training. In contrast, learning in the brain occurs through structural changes in tandem with changes in synaptic strength. We therefore propose Multi-Task Structural Learning (MTSL), which simultaneously learns the multi-task architecture and its parameters. MTSL begins with an identical single-task network for each task and alternates between a task learning phase and a structural learning phase. In the task learning phase, each network specializes in its corresponding task. In each structural learning phase, starting from the earliest layer, locally similar task layers first transfer their knowledge to a newly created group layer and are then removed. MTSL then uses the group layer in place of the removed task layers and moves on to the next layers. Our empirical results show that MTSL achieves generalization competitive with various baselines and improves robustness to out-of-distribution data.¹

1. INTRODUCTION

Artificial Neural Networks (ANNs) have exhibited strong performance in various tasks essential for scene understanding. Single-Task Learning (STL) (Yu et al., 2021; Wang et al., 2020b; Orsic et al., 2019), driven by custom task-specific improvements, has been largely at the center of this progress. Despite these improvements, using single-task networks for the multiple tasks required for scene understanding comes with notable problems, such as a linear increase in computational cost and a lack of inter-task communication. Multi-Task Learning (MTL), on the other hand, with the aid of shared layers, provides favorable benefits over STL such as improved inference efficiency and positive information transfer between tasks. However, a notable drawback of sharing layers is task interference. Existing works have attempted to alleviate task interference by modifying the architecture (Kanakis et al., 2020; Liu et al., 2019), determining which tasks to group together using a similarity notion (Standley et al., 2020; Fifty et al., 2021; Vandenhende et al., 2020), balancing task loss functions (Kendall et al., 2018; Liu et al., 2019; Yu et al., 2020; Lin et al., 2019), or learning the architecture (Guo et al., 2020; Lu et al., 2017). Although these methods have shown promise, further progress can be made by drawing inspiration from the brain, the only known intelligent system that excels at multi-task learning. The inner mechanisms of the brain, although not fully understood, can guide research in ANNs through simplified notions. Neuron creation and neuron removal (Maile et al., 2022) are simplified notions that can aid in the automated design of Multi-Task Networks (MTNs). Neuron removal presents the opportunity to start from a dense set of neurons and move toward a sparse one. In the early stages of brain development, neural circuits consist of excess neurons and connections that provide a rich information pipeline (Maile et al., 2022).
This pipeline allows neural circuits to learn specialized functions while undergoing neuron removal and synaptic pruning (Riccomagno & Kolodkin, 2015). Hence, moving from a dense architecture consisting of multiple single-task networks to a sparse multi-task architecture could be beneficial.

¹ Code will be made available after acceptance.
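The alternating task/structural learning loop described in the abstract can be sketched in simplified form. This is an illustrative skeleton only: the functions `locally_similar` and `create_group_layer`, and the scalar "layer summaries" they operate on, are hypothetical stand-ins for the paper's actual local task-similarity measure and knowledge-transfer step.

```python
def locally_similar(layers, threshold=0.5):
    """Placeholder similarity test: decide whether the task layers at this
    depth are similar enough to merge (the method uses a local task-similarity
    notion; here we just compare scalar summaries to their mean)."""
    mean = sum(layers) / len(layers)
    return all(abs(layer - mean) <= threshold for layer in layers)

def create_group_layer(layers):
    """Placeholder knowledge transfer: the newly created group layer absorbs
    the task layers it replaces (here, by simple averaging)."""
    return sum(layers) / len(layers)

def mtsl(task_networks, num_phases):
    """Alternate task learning and structural learning phases.

    Each network is a list of per-depth layer summaries. Structural learning
    proceeds from the earliest layer: if the task layers at a depth are
    locally similar, a group layer is created, the task layers are removed,
    and every task reuses the shared group layer at that depth.
    """
    depth = len(task_networks[0])
    shared = [None] * depth  # shared[d] holds the group layer at depth d, if any
    for d in range(min(num_phases, depth)):  # one depth per structural phase
        # (a task learning phase would update task_networks here)
        candidates = [net[d] for net in task_networks]
        if shared[d] is None and locally_similar(candidates):
            group = create_group_layer(candidates)
            shared[d] = group
            for net in task_networks:  # remove task layers, use the group layer
                net[d] = group
    return task_networks, shared
```

With two toy networks `[[1.0, 5.0], [1.2, 0.0]]`, the first-depth layers are close enough to merge into one shared layer, while the second-depth layers stay task-specific, mirroring how MTSL shares early layers and keeps dissimilar later layers separate.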

