TOWARD TRAINABILITY OF QUANTUM NEURAL NETWORKS

Abstract

Quantum Neural Networks (QNNs) have recently been proposed as generalizations of classical neural networks to achieve quantum speed-ups. Despite their potential to outperform classical models, serious bottlenecks exist for training QNNs: QNNs with random structures have poor trainability due to gradients that vanish at a rate exponential in the number of input qubits. This vanishing gradient could seriously hinder applications of large-size QNNs. In this work, we provide a first viable solution with theoretical guarantees. Specifically, we prove that QNNs with tree tensor and step controlled architectures have gradients that vanish at most polynomially with the qubit number. Moreover, our result holds irrespective of which encoding method is employed. We numerically demonstrate QNNs with tree tensor and step controlled structures on binary classification tasks. Simulations show faster convergence rates and better accuracy compared to QNNs with random structures.

1. INTRODUCTION

Neural networks (Hecht-Nielsen, 1992) trained with gradient-based optimization have dramatically advanced research in discriminative models, generative models, and reinforcement learning. To use parameters efficiently and improve trainability in practice, neural networks with specific architectures (LeCun et al., 2015) have been introduced for different tasks, including convolutional neural networks (Krizhevsky et al., 2012) for image tasks, recurrent neural networks (Zaremba et al., 2014) for time-series analysis, and graph neural networks (Scarselli et al., 2008) for tasks on graph-structured data. Recently, neural architecture search (Elsken et al., 2019) has been proposed to improve network performance by optimizing the network structure. Despite these successes, the development of neural network algorithms can be limited by the large computational resources required for model training. In recent years, quantum computing has emerged as one solution to this problem and has evolved into a new interdisciplinary field known as quantum machine learning (QML) (Biamonte et al., 2017; Havlíček et al., 2019). Specifically, variational quantum circuits (Benedetti et al., 2019) have been explored as efficient protocols for quantum chemistry (Kandala et al., 2017) and combinatorial optimization (Zhou et al., 2018). Compared to classical circuit models, quantum circuits have shown greater expressive power (Du et al., 2020a) and have demonstrated a quantum advantage in the low-depth case (Bravyi et al., 2018). Owing to their robustness against noise, variational quantum circuits have attracted significant interest in the hope of achieving quantum supremacy on near-term quantum computers (Arute et al., 2019). Quantum Neural Networks (QNNs) (Farhi & Neven, 2018; Schuld et al., 2020; Beer et al., 2020) are a special kind of quantum-classical hybrid algorithm that runs on trainable quantum circuits.
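As a concrete toy illustration of such a trainable quantum circuit (a minimal sketch of our own, not a construction from this work): a single qubit is rotated by RY(θ), the expectation of the Pauli-Z observable defines a differentiable objective f(θ) = cos θ, and the parameter-shift rule (exact for rotations generated by a Pauli operator) supplies the gradient used for descent.

```python
import numpy as np

# Toy trainable quantum circuit (illustrative only, not from this paper):
# |psi(theta)> = RY(theta)|0>, objective f(theta) = <psi| Z |psi> = cos(theta).
Z = np.diag([1.0, -1.0])  # Pauli-Z observable

def ry(theta):
    """Single-qubit rotation about the Y axis."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def f(theta):
    """Expectation of Z after applying RY(theta) to |0>."""
    psi = ry(theta) @ np.array([1.0, 0.0])
    return float(psi @ Z @ psi)

def grad(theta):
    """Parameter-shift rule: exact gradient for Pauli rotations."""
    return 0.5 * (f(theta + np.pi / 2) - f(theta - np.pi / 2))

# Gradient descent drives f(theta) = cos(theta) toward its minimum of -1.
theta = 0.3
for _ in range(100):
    theta -= 0.2 * grad(theta)
print(f(theta))  # close to -1 (theta near pi)
```

Here the gradient is evaluated from two extra circuit executions rather than backpropagation, which is why gradient-based training of QNNs hinges on the magnitude of these partial derivatives.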
Recently, small-scale QNNs have been implemented on real quantum computers (Havlíček et al., 2019) for supervised learning tasks. Training a QNN amounts to minimizing an objective function f with respect to the circuit parameters θ. Inspired by classical neural network optimization, a natural strategy for training QNNs is to exploit the gradient of the loss function (Crooks, 2019). However, recent work (McClean et al., 2018) shows that n-qubit quantum circuits with random structures and large depth L = O(poly(n)) tend to be approximately unitary 2-designs (Harrow & Low, 2009), and that their partial derivatives vanish exponentially with respect to n. This vanishing-gradient problem is usually referred to as the Barren Plateaus phenomenon (McClean et al., 2018), and could affect the

