GENERALIZING AND TENSORIZING SUBGRAPH SEARCH IN THE SUPERNET

Abstract

Recently, a special kind of graph, the supernet, which allows two nodes to be connected by multi-choice edges, has shown its power in neural architecture search (NAS) by finding better architectures for computer vision (CV) and natural language processing (NLP) tasks. In this paper, we observe that the design of such discrete architectures also appears in many other important learning tasks, e.g., logical chain inference in knowledge graphs (KGs) and meta-path discovery in heterogeneous information networks (HINs). We are therefore motivated to generalize the supernet search problem to a broader range of tasks. However, none of the existing works is effective in this setting, since the supernet's topology is highly task-dependent and diverse. To address this issue, we propose to tensorize the supernet, i.e., to unify the subgraph search problems under a tensor formulation and to encode the topology inside the supernet by a tensor network. We further propose an efficient algorithm that admits both stochastic and deterministic objectives to solve the search problem. Finally, we perform extensive experiments on diverse learning tasks, i.e., architecture design for CV, logic inference for KGs, and meta-path discovery for HINs. Empirical results demonstrate that our method leads to better performance and architectures.

1. INTRODUCTION

Deep learning (Goodfellow et al., 2017) has been successfully applied in many applications, such as image classification for computer vision (CV) (LeCun et al., 1998; Krizhevsky et al., 2012; He et al., 2016; Huang et al., 2017) and language modeling for natural language processing (NLP) (Mikolov et al., 2013; Devlin et al., 2018). While architecture design is of great importance to deep learning, manually designing proper architectures for a given task is hard, requires substantial human effort, and is sometimes even impossible (Zoph & Le, 2017; Baker et al., 2016). Recently, neural architecture search (NAS) techniques (Elsken et al., 2019), which mainly focus on CV and NLP tasks, have been developed to alleviate this issue. Behind existing NAS methods, a multi-graph (Skiena, 1992) structure, i.e., the supernet (Zoph et al., 2017; Pham et al., 2018; Liu et al., 2018), where nodes are connected by edges with multiple choices, has played a central role. In this context, the choices on each edge are different operations, and the subgraphs correspond to different neural architectures. The objective is to find a suitable subgraph in the supernet, i.e., a better neural architecture for a given task. However, the supernet does not arise only in the CV/NLP field; we find that it also emerges in many other deep learning areas (see Table 1). One example is logical chain inference on knowledge graphs (Yang et al., 2017; Sadeghian et al., 2019; Qu & Tang, 2019), where the construction of logical rules can be modeled by a supernet. Another example is meta-path discovery in heterogeneous information networks (Yun et al., 2019; Wan et al., 2020), where the discovery of meta-paths can also be modeled by a supernet. Therefore, we propose to broaden the horizon of NAS, i.e., to generalize it to many deep learning fields and solve the new NAS problem under a unified framework.

Since subgraphs are discrete objects (the choices on each edge are discrete), a common approach (Liu et al., 2018; Sadeghian et al., 2019; Yun et al., 2019) is to transform the search into a continuous optimization problem. Previous methods often introduce continuous parameters separately for each edge. However, this formulation cannot generalize to different supernets, as the topological structures of supernets are highly task-dependent and diverse. It therefore fails to capture the supernet's topology and is ineffective.

• We broaden the horizon of existing supernet-based NAS methods. Specifically, we generalize the concept of subgraph search in the supernet from NAS to other deep learning tasks that have graph-like structures, and propose to solve them in a unified framework by tensorizing the supernet.
• While existing supernet-based NAS methods ignore the topological structure of the supernet, we encode the supernet in a topology-aware manner based on the tensor network and propose an efficient algorithm to solve the search problem.
• We conduct extensive experiments on various learning tasks, i.e., architecture design for CV, logical inference for KGs, and meta-path discovery for HINs. Empirical results demonstrate that our method can find better architectures, which lead to state-of-the-art performance on various applications.
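The contrast between per-edge parameters and a tensor over subgraphs can be illustrated with a small sketch. It is not the construction used in this paper: the function names, the logits, and the generic tensor-train layout are all assumptions made for illustration. With independent per-edge parameters, the weight of a subgraph is a product of per-edge probabilities, so the table of all subgraph weights is a rank-1 tensor. A tensor network with larger internal ranks can couple choices on different edges.

```python
import itertools
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def per_edge_subgraph_weights(edge_logits):
    """Weight of every subgraph under independent per-edge parameters.

    edge_logits[e][c] is a hypothetical logit for choice c on edge e.
    A subgraph picks one choice per edge; its weight is the product of the
    per-edge softmax probabilities, i.e. one entry of a rank-1 tensor.
    """
    probs = [softmax(logits) for logits in edge_logits]
    weights = {}
    for choice in itertools.product(*(range(len(p)) for p in probs)):
        weights[choice] = math.prod(p[c] for p, c in zip(probs, choice))
    return weights


def tensor_train_entry(cores, choice):
    """Entry of a tensor stored in (generic) tensor-train form.

    cores[e][c] is an r_e x r_{e+1} matrix (list of rows) for choice c on
    edge e, with the first and last ranks equal to 1. The entry for a
    subgraph is the product of the matrices selected by its choices.
    """
    vec = [1.0]  # 1 x r_0 row vector
    for core, c in zip(cores, choice):
        mat = core[c]
        vec = [sum(vec[i] * mat[i][j] for i in range(len(vec)))
               for j in range(len(mat[0]))]
    return vec[0]


if __name__ == "__main__":
    # Two edges with 2 and 3 choices: 6 subgraphs in total.
    weights = per_edge_subgraph_weights([[0.0, 1.0], [2.0, 0.0, -1.0]])
    for choice, w in sorted(weights.items()):
        print(choice, round(w, 4))
```

When every core matrix is 1 x 1, `tensor_train_entry` reduces to the per-edge product. With rank-2 cores it can represent, for example, a tensor that is nonzero only when two edges make the same choice, which no assignment of independent per-edge probabilities reproduces.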

2.1. SUPERNET IN NEURAL ARCHITECTURE SEARCH (NAS)

Numerous algorithms have been proposed to solve the NAS problem. DARTS was the first to introduce a deterministic formulation to the NAS field, and SNAS uses a parametrization similar to that of DARTS under a stochastic formulation. NASP improves upon DARTS by using the proximal operator (Parikh & Boyd, 2014) and activates only one subgraph in each iteration to avoid co-adaptation between subgraphs.
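The three formulations mentioned above can be contrasted on a single edge with a few lines of code. This is a simplified reading rather than the reference implementation of any of these methods: the function names are hypothetical, the candidate operations are reduced to scalar outputs, the stochastic variant is assumed to use a Gumbel-softmax relaxation, and the proximal step is assumed to be a Euclidean projection onto one-hot vectors (which keeps the largest entry).

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def deterministic_mix(logits, op_outputs):
    """Deterministic relaxation: softmax-weighted sum of all candidates."""
    weights = softmax(logits)
    return sum(w * o for w, o in zip(weights, op_outputs))


def stochastic_mix(logits, op_outputs, temperature, rng):
    """Stochastic relaxation: weights drawn from a Gumbel-softmax sample."""
    gumbel = []
    for _ in logits:
        u = max(rng.random(), 1e-12)  # avoid log(0)
        gumbel.append(-math.log(-math.log(u)))
    weights = softmax([(l + g) / temperature for l, g in zip(logits, gumbel)])
    return sum(w * o for w, o in zip(weights, op_outputs))


def project_one_hot(logits):
    """Projection onto one-hot vectors: activate only the largest entry."""
    best = max(range(len(logits)), key=logits.__getitem__)
    return [1.0 if i == best else 0.0 for i in range(len(logits))]


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [0.1, 0.7, 0.2]        # hypothetical architecture parameters
    outputs = [1.0, 2.0, 3.0]       # hypothetical outputs of three operations
    print(deterministic_mix(logits, outputs))
    print(stochastic_mix(logits, outputs, temperature=0.5, rng=rng))
    print(project_one_hot(logits))
```

The deterministic and stochastic variants both evaluate every candidate operation on the edge, whereas the projected one-hot vector selects a single operation, so only one subgraph is active in the forward pass.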

2.2. TENSOR METHODS IN MACHINE LEARNING

A tensor (Kolda & Bader, 2009) is a multi-dimensional array that extends vectors and matrices to higher orders. Tensor methods have found wide applications in machine learning, including network



A comparison of existing NAS/non-NAS works for designing discrete architectures based on our tensorized formulation for the supernet. "Topology" indicates whether the topological structure of the supernet is utilized.

In this paper, we propose a novel method, TRACE, which introduces a continuous parameter for each subgraph (together, these parameters form a tensor). We then construct a tensor network (TN) (andrzej. et al., 2016; 2017) based on the topological structure of the supernet. For different tensor networks, we introduce an efficient algorithm for optimization on supernets. Extensive experiments are conducted on diverse deep learning tasks. Empirical results demonstrate that TRACE performs better than the state-of-the-art methods in each domain. In summary, our contributions are as follows:

The first NAS work, NASRL (Zoph & Le, 2017), models the NAS problem as a multiple decision-making problem and proposes to use reinforcement learning (RL) (Sutton & Barto, 2018) to solve it. However, this formulation does not consider the repetitively stacked nature of neural architectures and is very inefficient, as it has to train many different networks to convergence. To alleviate this issue, NASNet (Zoph et al., 2017) first models NAS as an optimization problem on a supernet. The supernet formulation enables searching for transferable architectures across different datasets and improves search efficiency. Later, based on the supernet formulation, ENAS (Pham et al., 2018) proposes weight-sharing techniques, in which the subgraphs of a supernet share weights.
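The idea of weight sharing can be illustrated with a toy sketch. It is not the ENAS implementation: the class name, the scalar "weights", and the chain-shaped supernet are assumptions chosen to keep the example small. The supernet stores one parameter per (edge, choice) pair, and every subgraph reads its parameters from that shared pool instead of owning a separate copy.

```python
import random


class SharedSupernetWeights:
    """Toy supernet on a chain: one scalar weight per (edge, choice) pair."""

    def __init__(self, num_edges, num_choices, seed=0):
        rng = random.Random(seed)
        # One-element lists act as mutable parameter holders.
        self.params = {
            (e, c): [rng.uniform(-1.0, 1.0)]
            for e in range(num_edges)
            for c in range(num_choices)
        }

    def subgraph_params(self, choice):
        """Parameters of the subgraph that picks choice[e] on edge e."""
        return [self.params[(e, c)] for e, c in enumerate(choice)]

    def forward(self, choice, x):
        """Apply the chosen scalar weight on each edge in sequence."""
        for (w,) in self.subgraph_params(choice):
            x = x * w
        return x


if __name__ == "__main__":
    net = SharedSupernetWeights(num_edges=2, num_choices=2)
    a = net.subgraph_params((0, 1))
    b = net.subgraph_params((0, 0))
    # Both subgraphs use choice 0 on edge 0, so they hold the same object.
    print(a[0] is b[0], a[1] is b[1])
    a[0][0] = 2.0  # an update made through one subgraph...
    print(net.forward((0, 0), 1.0) == 2.0 * b[1][0])  # ...is seen by the other
```

Because subgraphs that agree on an edge reference the same parameter, training one subgraph also updates every other subgraph that contains that edge, so each candidate does not have to be trained from scratch.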

