QUANTUM FOURIER NETWORKS FOR SOLVING PARAMETRIC PDES

Abstract

Many real-world problems, such as modelling environment dynamics, physical processes and time series, involve solving Partial Differential Equations (PDEs) parameterized by problem-specific conditions. Recently, a deep learning architecture called the Fourier Neural Operator (FNO) proved capable of learning solutions of given PDE families for any initial conditions as input. Given the advancements in quantum hardware and the recent results in quantum machine learning methods, we propose three quantum circuits, inspired by the FNO, to learn this functional mapping for PDEs. The proposed algorithms are distinguished by the trade-off between depth and their similarity to the classical FNO. At their core, we make use of the unary encoding paradigm and orthogonal quantum layers, and introduce a new quantum Fourier transform in the unary basis. With respect to the number of samples, our quantum algorithm is proven to be substantially faster than the classical counterpart. We benchmark our proposed algorithms on three PDE families, namely Burgers' equation, Darcy's flow equation and the Navier-Stokes equation, and the results show that our quantum methods are comparable in performance to the classical FNO. We also present an analysis of image classification tasks in which our proposed algorithms match the accuracy of CNNs, thereby showing their applicability to other domains.

1. INTRODUCTION

Solving Partial Differential Equations (PDEs) is a crucial step in understanding the dynamics of nature. PDEs have been widely used to model natural phenomena such as heat transfer, fluid flow and electromagnetism, and lately have also found applications in understanding the behavior of dynamical markets. In most of these applications, a closed-form solution of the resulting PDE is difficult to find, and classical solvers therefore require many evaluations to model the solution of a given PDE. To approximate such hard-to-solve PDEs, there has been extensive research based on neural networks. One group of methods Yu et al. (2018); Raissi et al. (2019); Bar & Sochen (2019) aims at learning the solution function for a single PDE instance and thus needs to be re-trained every time the parameters or conditions of the PDE change. A second group Zhu & Zabaras (2018); Adler & Öktem (2017); Bhatnagar et al. (2019) targets learning over a family of PDEs but is resolution dependent, which limits these methods to the discretization or sampling density used in the training data. A recent work Li et al. (2020) addressed both issues and posed the problem as learning a function-to-function mapping for parametric PDEs. Experiments on widely used PDEs showed that it was effective in learning the mapping from a parametric initial condition function to the solution operator for a family of PDEs. The method proposes a Fourier layer, which uses a learnable linear transform sandwiched between a Fourier Transform (FT) and an Inverse Fourier Transform (IFT) operation. This is similar to the convolution operation, since convolution also translates to multiplication in Fourier space.
The major bottleneck that might hinder the scalability of this classical Fourier Neural Operator (FNO) is its time complexity, which is limited by the classical Fourier Transform (FT) and Inverse Fourier Transform (IFT) operations inside the Fourier Layer. As shown previously by Musk (2020), these operations are much faster when deployed on quantum hardware. Similar advantages have led to significant developments in learning approaches based on near-term quantum computing. The initial demonstrations of these algorithms involved experiments on small-scale hardware Farhi & Neven (2018); Coyle et al. (2020); Cappelletti et al. (2020); Grant et al. (2018), which established their effectiveness in extracting patterns. Following this, many works Abbas et al. (2021); Mari et al. (2020); Beer et al. (2020); Allcock et al. (2020) proposed small-scale implementations of fully connected quantum neural networks on near-term hardware. Other proposals Kerenidis et al. (2019); Cong et al. (2019) for deploying convolution-based learning methods on quantum devices showed effective training in practice. Furthermore, Chakrabarti et al. (2019) proposed a quantum-hardware implementation of generative adversarial networks. A different approach, in which the inputs are encoded as unary states using the two-qubit quantum gate RBS (Reconfigurable Beam Splitter), was proposed in a recent work Johri et al. (2021). This encoding gave rise to the use of the orthogonality of pure quantum unitaries, as proposed in Kerenidis et al. (2021), for training, for instance, orthogonal feed-forward networks to damp gradient-based issues during learning. That work used a pyramid circuit based on parameterized RBS gates to implement a learnable orthogonal matrix, in contrast to existing classical approaches, which offer approximate orthogonality at the cost of increased training time. Orthogonality in neural networks results in much smoother convergence and fewer parameters, as shown by Li et al. (2019) for feed-forward neural networks and Wang et al. (2020) for convolutional nets. The effectiveness of these orthogonal quantum networks was further shown in another work on medical image classification Mathur et al. (2021). We use a similar idea to perform the multi-channel intermediate linear transform with an orthogonal matrix, but using parameterized butterfly circuits instead of pyramid circuits.

Exploiting these advantages offered by quantum computing, in this work we propose a new kind of Quantum Fourier Transform (QFT), which operates on unary states, and a learnable Quantum Linear Transform. We further propose three quantum algorithms inspired by the classical Fourier operation, which are faster than the classical operation and require fewer parameters for the same architecture, thereby boosting their scalability. Given an input of dimension $N_s \times N_c$, where $N_s$ corresponds to the number of samples per PDE and $N_c$ to the feature dimension, the order of time complexity of the Fourier Layer (FL) and of the proposed algorithms is shown in Table 1.

Table 1: Comparison of the order of time/depth complexities (O) of the proposed circuits with the existing classical Fourier Layer (FL). Here $N_s$ denotes the sampling dimension, $N_c$ denotes the feature dimension, where $N_s \gg N_c$, and $K$ (usually in the range 4-16) denotes the maximum number of modes allowed Li et al. (2020). This implies that the proposed quantum algorithms would be faster than the classical method. Each quantum circuit requires $N_c + N_s$ qubits and $K$ independent parallel circuits are required by the Parallelized QFNO.

Method                  Complexity                        # qubits     # parallel circuits
Classical FL            N_c + N_s log(N_s)                -            -
Sequential Quantum FL   K log(N_c) + N_c log(N_s)         N_c + N_s    1
Parallel Quantum FL     log(N_c) + N_c log(N_s)           N_c + N_s    K
Compound Quantum FL     log(N_c + K) + N_c log(N_s)       N_c + N_s    1

The first algorithm replicates the classical operation on a quantum circuit. The other two algorithms are modifications of the first circuit designed for the noisy learning process offered by near-term quantum hardware. We test all three proposed algorithms on the three PDEs evaluated in the classical FNO paper Li et al. (2020), namely Burgers' equation, Darcy's flow equation and the Navier-Stokes equation, on the synthetic datasets used in that paper. We also test our algorithms against Convolutional Neural Networks (CNNs) on benchmark datasets for image classification, namely MNIST, Fashion-MNIST Xiao et al. (2017) and Pneumonia-MNIST Yang et al. (2021). In all the experiments, the three algorithms perform similarly to one another and comparably to the state-of-the-art FNO for PDEs. They also perform reasonably well on the image classification tasks.

2. CLASSICAL FOURIER NEURAL OPERATOR

Given a training set comprising a family of Partial Differential Equations, the classical FNO Li et al. (2020) aims to learn a functional mapping from a parameterized initial condition to the solution function for this family. That is, given an initial condition function characterizing a PDE instance, sampled at different points, it should predict the solution function values at those points at inference time. Formally, given two function spaces $\mathcal{A}$ and $\mathcal{U}$ along with a set of observations $\{a_j, u_j\}$ ($a_j \sim \mu$ is an i.i.d. sequence sampled from some function $f \in \mathcal{A}$), it learns a parametric mapping $G : \mathcal{A} \times \Theta \to \mathcal{U}$. To achieve this, it proposes a network based on iteratively applying a new kind of layer, termed the Fourier Layer. The layer consists of two parts. The top part first projects the input to the Fourier domain, then applies a linear transform (refer to Fig. 1) to the first $K$ modes and crops the rest, and finally projects back to the original domain. This is somewhat similar to convolution, but the updates take place in Fourier space. The lower part is just a simple convolution and does not play any significant role in improving performance. In this paper, we propose purely quantum algorithms for this top part as a Quantum Fourier Layer and term the network the Quantum Fourier Neural Operator (QFNO).

Now, we look at the mathematical details of the classical Fourier layer for a 1D PDE (e.g. Burgers' equation), showing the inputs and outputs of each transformation involved. Given a classical input $x \in \mathbb{R}^{N_c \times N_s}$, where $N_c$ is the number of channels/features in the input and $N_s$ is the number of samples/observations per function corresponding to a PDE, we denote the output of this classical operation by $y \in \mathbb{R}^{N_c \times N_s}$.
As the quantum matrices are orthogonal and the $\ell_2$-norm of any quantum state vector is 1, we first normalize the input $x$ (normalization does not have any significant impact on the optimization process) as follows:

$$a_{ij} = \frac{x_{ij}}{\|x\|_2} \tag{1}$$

Following the classical operation, we apply the Fourier Transform (FT) along the second dimension ($N_s$) of this input, $\mathrm{FT}(a_i) = a^f_i$, where $a_i = (a_{ij})_{j\in[1,N_s]}$ and $a^f_i$ denotes the corresponding Fourier-transformed coefficients. Denoting the maximum number of modes by $K$, the intermediate linear transform is a tensor $W \in \mathbb{R}^{N_c\times N_c\times K}$, and $W_k \in \mathbb{R}^{N_c \times N_c}$ denotes $W$ indexed along the last dimension (corresponding to the $k$-th mode). Here, we assume this matrix $W_k$ is orthogonal and thus can be implemented by a quantum layer. This transformation is applied to the first $K$ modes along the $N_s$ dimension, resulting in the following output:

$$(W_j a^f_j)_{j\in[1,K]},\ (a^f_j)_{j\in[K+1,N_s]} \tag{2}$$

where the vector $a^f_j = (a^f_{ij})_{i\in[1,N_c]}$. The classical paper discards the remaining modes (sets the vectors to zero), whereas we leave them unchanged here. We verified that this does not impact the optimization significantly. For the $K$ transformed vectors $a^f_j \in \mathbb{R}^{N_c}$, we define $b^f_j = W_j a^f_j$, or more explicitly $b^f_{ij} = \sum_{t=1}^{N_c} (W_j)_{it}\, a^f_{tj}$. Given $W_j$, $a_j$ and $b_j$, we can now write the input after the Fourier Transform and the intermediate linear transform as:

$$(b^f_j)_{j\in[1,K]},\ (a^f_j)_{j\in[K+1,N_s]} \tag{3}$$

Finally, applying the Inverse Fourier Transform (IFT) to this transformed input along the $N_s$ dimension gives the output of the complete operation:

$$y_i = \mathrm{IFT}\left((b^f_{ij})_{j\in[1,K]},\ (a^f_{ij})_{j\in[K+1,N_s]}\right) \tag{4}$$

where $y_i = (y_{ij})_{j\in[1,N_s]}$. The time complexity of this complete Fourier Layer (FT + linear transform + IFT) is $O(N_c + N_s\log(N_s))$.
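The steps of the classical Fourier layer described in this section can be sketched numerically as follows. This is a minimal NumPy illustration under stated assumptions, not the implementation used in the experiments: the function names `fourier_layer` and `random_orthogonal`, the toy sizes, and the use of a full complex FFT (so that the output is complex in general) are all choices made here for illustration. Modes are indexed from 0, so "the first $K$ modes" are columns `0..K-1`.

```python
import numpy as np

def fourier_layer(x, W, keep_rest=True):
    """FT -> per-mode linear transform -> IFT along the sample axis.

    x: real array of shape (N_c, N_s)
    W: real array of shape (K, N_c, N_c), one matrix per retained mode
    keep_rest: True leaves modes K..N_s-1 unchanged (the variant described
               in this section); False zeroes them (the classical FNO).
    """
    K = W.shape[0]
    a = x / np.linalg.norm(x)                 # Eq. 1: global l2 normalization
    a_f = np.fft.fft(a, axis=1)               # FT along the N_s dimension
    b_f = a_f.copy() if keep_rest else np.zeros_like(a_f)
    # b^f_{ij} = sum_t (W_j)_{it} a^f_{tj} for the first K modes
    b_f[:, :K] = np.einsum("kit,tk->ik", W, a_f[:, :K])
    return np.fft.ifft(b_f, axis=1)           # inverse transform

def random_orthogonal(n, rng):
    """Hypothetical helper: a random orthogonal matrix from a QR factorization."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

rng = np.random.default_rng(0)
N_c, N_s, K = 4, 16, 3                         # toy sizes, chosen for illustration
x = rng.standard_normal((N_c, N_s))
W = np.stack([random_orthogonal(N_c, rng) for _ in range(K)])
y = fourier_layer(x, W)
```

Because each `W[k]` is orthogonal and the untouched modes are kept, the output has the same unit norm as the normalized input, which is the property that lets the same operation be expressed on a quantum state.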

3. QUANTUM FOURIER NETWORKS

In this section we propose three quantum circuits to replace the Fourier layer of the FNO, namely the Sequential (Section 3.2.1), Parallel (Section 3.2.2) and Compound (Section 3.2.3) circuits. We compare their computational complexities (see Table 1) and their efficiency in practice in the following sections. Before this, we define a learnable orthogonal linear transform, implemented by a quantum circuit called the parameterized butterfly. Using this as the building block, we propose the quantum circuit that carries out the intermediate linear transform of the classical FNO (Section 2).

3.1. QUANTUM BUTTERFLY CIRCUITS

An efficient implementation of the FFT, proposed by Cooley & Tukey (1965), involves connections in the shape of radix-2 butterflies, resulting in logarithmic depth. Taking inspiration from this, we propose butterfly-shaped quantum circuits to carry out: a) a learnable linear transform and b) a Quantum Fourier Transform (QFT), both operating only on unary states. Here each radix-2 operation is implemented using RBS gates Foxen et al. (2020). Refer to the supplementary material for details regarding the unitary corresponding to this gate.

Parameterized Quantum Butterfly. Because of the logarithmic depth offered by butterfly circuits for the FT, which is important given noisy near-term quantum hardware, we propose a butterfly-shaped quantum circuit to carry out a learnable linear transform. Here, each radix-2 butterfly-shaped connection is replaced by a parameterized RBS gate, similar to a concurrent work Cherrat et al. (2022). This results in an orthogonal linear transform acting only on the unary quantum states, similar to the pyramidal circuit proposed by Kerenidis et al. (2021). We further propose a controlled version of these linear quantum layers to perform an operation similar to the intermediate linear transform in the Fourier layers.

Unary Quantum Fourier Transform. Here, we propose a butterfly-shaped quantum circuit to carry out the QFT on unary states. It replaces each radix-2 operation in the butterfly FT by a single-qubit gate and an RBS gate with angle $-\pi/4$. Refer to the supplementary material for more details and circuit diagrams for the Parameterized Butterfly layer and the Unary QFT.
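To make the parameterized butterfly concrete, the sketch below builds the orthogonal matrix that such a circuit induces on the unary subspace, where each RBS gate between qubits $p$ and $q$ acts as a planar (Givens) rotation on amplitudes $p$ and $q$. It is an illustration under assumptions: the sign convention of the rotation, the layer ordering (strides $n/2, n/4, \dots, 1$) and the names `rbs_unary` and `butterfly_matrix` are choices made here, and the exact wiring in the supplementary circuit diagrams may differ.

```python
import numpy as np

def rbs_unary(n, p, q, theta):
    """Action of an RBS gate between qubits p and q on the n-dimensional
    unary subspace: a Givens rotation (sign convention assumed here)."""
    g = np.eye(n)
    c, s = np.cos(theta), np.sin(theta)
    g[p, p], g[p, q] = c, s
    g[q, p], g[q, q] = -s, c
    return g

def butterfly_matrix(thetas):
    """thetas has shape (log2(n), n // 2): one angle per gate.
    Returns the n x n orthogonal matrix of the butterfly circuit."""
    depth, half = thetas.shape
    n = 2 * half
    assert n == 2 ** depth, "n must be a power of two"
    w = np.eye(n)
    for layer in range(depth):
        stride = n >> (layer + 1)              # n/2, n/4, ..., 1
        # disjoint pairs (i, i + stride): all gates of a layer run in parallel
        pairs = [(i, i + stride) for i in range(n) if (i // stride) % 2 == 0]
        for (p, q), theta in zip(pairs, thetas[layer]):
            w = rbs_unary(n, p, q, theta) @ w
    return w

rng = np.random.default_rng(1)
n = 8
thetas = rng.uniform(0.0, 2.0 * np.pi, size=(3, n // 2))   # (n/2) * log2(n) angles
w = butterfly_matrix(thetas)
```

The circuit uses $(n/2)\log_2 n$ angles in $\log_2 n$ layers, compared with the $n(n-1)/2$ angles of a full pyramid, which is the depth and parameter saving this section refers to.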

3.2. QUANTUM CIRCUITS FOR FOURIER LAYER

To perform the classical FNO operation using a quantum circuit, the final output state of the circuit should correspond to the quantum-state encoding of the output of the classical operation. Therefore, we load this classical output into a quantum state, and for this we use unary dataloading similar to the one used in Johri et al. (2021) because of its logarithmic depth. It loads an $N$-dimensional classical vector into $N$ normalized quantum states corresponding to the unary representation of the numbers from 1 to $N$. The circuit consists only of Reconfigurable Beam Splitter (RBS) gates. Using a controlled version of this circuit, a matrix loader can be defined, which we discuss in the supplementary material; it loads the normalized vectors comprising the matrix into quantum states. Thus, for the classical output $y = (y_1, \dots, y_{N_c})$, the encoded quantum state $|y\rangle$ is:

$$|y\rangle = \sum_i |e_i\rangle |y_i\rangle = \sum_i^{N_c} \sum_j^{N_s} \mathrm{IFT}\left((b^f_{ij})_{j\in[1,K]},\ (a^f_{ij})_{j\in[K+1,N_s]}\right)_j |e_i\rangle |e_j\rangle \tag{5}$$

where $|e_i\rangle$ denotes the state in unary notation with the 1 at the $i$-th position. Note that there is no normalization factor, as the inputs $(a_1, \dots, a_{N_c})$ were assumed normalized in the previous section.

We now discuss the three proposed circuits, namely the sequential, parallelized and compound circuits. The sequential circuit replicates the classical operation discussed above, and the other two are modifications of it that make the quantum algorithm more efficiently deployable on hardware.
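As a small illustration of the unary encoding used above, the sketch below embeds a vector, and a matrix on two registers, into the full qubit state vector. It is only meant to show where the amplitudes sit; the helper names and the bit-ordering convention (qubit 1 as the most significant bit) are assumptions made here, and the $2^N$-dimensional vector is only practical for very small $N$.

```python
import numpy as np

def unary_state(a):
    """Place a_i / ||a|| on the basis state with a single 1 at qubit i
    (qubit 1 is taken as the most significant bit: a convention)."""
    a = np.asarray(a, dtype=float)
    n = a.size
    psi = np.zeros(2 ** n)
    for i, amp in enumerate(a / np.linalg.norm(a)):
        psi[1 << (n - 1 - i)] = amp
    return psi

def unary_matrix_state(x):
    """Two-register encoding sum_ij a_ij |e_i>|e_j> of an N_c x N_s matrix,
    with the upper (N_c) register in the high-order bits."""
    x = np.asarray(x, dtype=float)
    a = x / np.linalg.norm(x)
    n_c, n_s = a.shape
    psi = np.zeros(2 ** (n_c + n_s))
    for i in range(n_c):
        for j in range(n_s):
            idx = ((1 << (n_c - 1 - i)) << n_s) | (1 << (n_s - 1 - j))
            psi[idx] = a[i, j]
    return psi

psi_vec = unary_state([3.0, 4.0])                      # 2 qubits, 2 nonzero amplitudes
psi_mat = unary_matrix_state([[1.0, 2.0], [3.0, 4.0]]) # 4 qubits, 4 nonzero amplitudes
```

Only $N_c N_s$ of the $2^{N_c+N_s}$ amplitudes are nonzero, which is why the analysis in this section can be carried out directly on the coefficient matrix $(a_{ij})$.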

3.2.1. SEQUENTIAL QUANTUM CIRCUIT FOR FOURIER LAYER

Figure 1 shows the diagram for this circuit. The lower register of $N_s$ qubits corresponds to the second dimension and the upper register of $N_c$ qubits corresponds to the $N_c$ dimension used in the above mathematical description of the classical operation. We begin by discussing the quantum dataloading, followed by the quantum transforms corresponding to the classical ones formulated above.

Dataloading. Similar to the quantum-state encoding of the classical output defined above, here too we use the unary matrix loader to load the given input into a quantum state. As shown in Figure 1, the circuit initially contains $N_c$ controlled unary dataloaders, each encoding a row of $N_s$ inputs into the quantum state on the lower register. The loaded state, after the controlled operation $X_{N_c}$ in the circuit, can be written as:

$$\sum_i^{N_c} \sum_j^{N_s} a_{ij}\, |e_i\rangle |e_j\rangle \tag{6}$$

where the coefficients $a_{ij}$ correspond to the normalized matrix elements. Denoting the given classical input by $x \in \mathbb{R}^{N_c \times N_s}$, we have (refer to the supplementary material for details):

$$a_{ij} = \frac{x_{ij}}{\|x\|_2} \tag{7}$$

To apply the QFT operation on the lower register, the state can be re-arranged as follows:

$$\sum_i^{N_c} |e_i\rangle \left(\sum_j^{N_s} a_{ij}\, |e_j\rangle\right) \tag{8}$$

Quantum Fourier Transform (QFT). Given a normalized real vector $x = (x_i)_{i=1,\dots,M}$ and its Fourier Transform $x^f = (x^f_i)_{i=1,\dots,M}$, the QFT operation in the unary basis and its inverse, the IQFT operation, can be defined as follows:

$$\mathrm{QFT}\left(\sum_i x_i |e_i\rangle\right) = \sum_i x^f_i |e_i\rangle \quad \text{and} \quad \mathrm{IQFT}\left(\sum_i x^f_i |e_i\rangle\right) = \sum_i x_i |e_i\rangle \tag{9}$$

Applying this QFT operation to the lower register of the state in eq. 8, we have:

$$\sum_i^{N_c} |e_i\rangle\, \mathrm{QFT}\left(\sum_j^{N_s} a_{ij} |e_j\rangle\right) = \sum_i^{N_c} |e_i\rangle \sum_j^{N_s} a^f_{ij} |e_j\rangle = \sum_i^{N_c} \sum_j^{N_s} a^f_{ij}\, |e_i\rangle |e_j\rangle \tag{10}$$

Quantum Linear Transform. To realize the intermediate transform used in the classical operation, we implement an orthogonal matrix $W_k$, corresponding to the one in the classical operation, and realize it using a parameterized butterfly circuit.
Furthermore, we propose a "k-butterfly", which is a parameterized butterfly circuit ($P_k$) on the top register controlled by the $k$-th qubit of the lower register. As shown in Figure 1, it uses an ancilla qubit (top-most) initialized to the state $|1\rangle$. When this qubit is in the state $|1\rangle$, combining it with the upper register results in a Hamming weight 2 basis $|h_2\rangle$. Any transformation on this using RBS gates results in another state of the Hamming weight 2 basis. We can ignore all these Hamming weight 2 states using a post-select operation. To apply the k-butterfly, we flip this qubit back to $|0\rangle$ if the $k$-th qubit of the lower register is $|1\rangle$, using a CNOT gate. Thus, all the states of the upper qubits are in the $|h_2\rangle$ basis, except for the ones corresponding to the $k$-th unary state of the lower register:

$$\sum_j^{N_s} \sum_i^{N_c} a^f_{ij}\, |1\rangle |e_i\rangle |e_j\rangle \rightarrow \sum_{j=k} \sum_i^{N_c} a^f_{ij}\, |0\rangle |e_i\rangle |e_j\rangle + \sum_{j \neq k} \sum_i^{N_c} a^f_{ij}\, |1\rangle |e_i\rangle |e_j\rangle \tag{11}$$

Applying $K$ such k-butterfly circuits, we perform the operation $P_j$ on both the unary and the $|h_2\rangle$ states. We consider only the unary states in our analysis, since we discard the states in the other basis:

$$\sum_j^{K} P_j\left(\sum_i^{N_c} a^f_{ij}\, |0\rangle |e_i\rangle\right) |e_j\rangle + \sum_{j=K+1}^{N_s} \left(\sum_i^{N_c} a^f_{ij}\, |1\rangle |e_i\rangle\right) |e_j\rangle \tag{12}$$

Considering only the top register, the k-butterfly operation $P_j$ corresponds to the sub-matrix $W_j \in \mathbb{R}^{N_c \times N_c}$. The overall matrix $(a^f_{ij})$ can be decomposed into $N_s$ vectors $a^f_j = (a^f_{ij})_{i\in[1,N_c]}$. Then, for the first $K$ vectors $a^f_j \in \mathbb{R}^{N_c}$ we have $b^f_j = W_j a^f_j$, where $b^f_j$ is the same as in the classical case. Finally, the output state of this circuit after the IQFT on the lower register becomes:

$$\sum_i^{N_c} |e_i\rangle\, \mathrm{IQFT}\left(\sum_j^{K} b^f_{ij} |e_j\rangle + \sum_{j=K+1}^{N_s} a^f_{ij} |e_j\rangle\right) \tag{13}$$

Since $\mathrm{IQFT}(\sum_i x^f_i |e_i\rangle) = \sum_i x_i |e_i\rangle$, where $\mathrm{IFT}(x^f) = x$, the $j$-th component of the IFT is the same as the coefficient of the $j$-th state in the IQFT. From this we can conclude that the state in eq. 5 is equivalent to the state in eq. 13, and thus this circuit replicates the classical operation.
The depth complexity of this circuit is $O((K + 1)\log(N_c) + (N_c + 2)\log(N_s))$.

3.2.2. PARALLELIZED QUANTUM CIRCUIT FOR FOURIER LAYER

Given the multiplicative noise model for NISQ devices, the depth of the learnable part, which is proportional to $K$ for the sequential circuit, might be a hindrance to learning, and it also increases the computation time. A useful modification is to parallelize the learnable butterfly circuits. Figure 2 shows this modified version of the sequential circuit, which consists of $K$ quantum circuits operating in parallel, each implementing only one learnable circuit controlled by one of the top $K$ qubits in the lower register. As all the circuits up to the learnable part are the same as in the sequential circuit, we can directly write the state after the QFT, using eq. 10, as:

$$\left[\sum_j^{N_s} \sum_i^{N_c} (a^f_{ij})_k\, |e_i\rangle_k |e_j\rangle_k\right]_{k=1}^{K} \tag{14}$$

where the index $k$ denotes the $k$-th parallel circuit. In the $k$-th parallel circuit, the learnable butterfly part is controlled by the $k$-th qubit of the lower register. We recall that the butterfly circuit applied on the top register effectively maps the vector $a^f_j$ to $b^f_j$ (see eq. 13), and thus we can write the updated state of the circuits as:

$$\left[\sum_{j \neq k} \sum_i^{N_c} (a^f_{ij})_k\, |e_i\rangle_k |e_j\rangle_k + \sum_i^{N_c} (b^f_{ik})_k\, |e_i\rangle_k |e_k\rangle_k\right]_{k=1}^{K} \tag{15}$$

Applying the IQFT on the lower register in each of the circuits independently gives:

$$\left[\sum_i^{N_c} |e_i\rangle_k\, \mathrm{IQFT}\left((b^f_{ik})_k\, |e_k\rangle_k + \sum_{j \neq k} (a^f_{ij})_k\, |e_j\rangle_k\right)\right]_{k=1}^{K} \tag{16}$$

In the supplementary material, we derive that the joint output of these circuits cannot lead to the output of the classical operation, even after measurement and classical post-processing. We also show how measuring without applying the IQFT operation and instead applying the classical IFT, after some post-processing, can lead to the same output as the classical operation. The depth of this parallel version of the sequential circuit is $O(2\log(N_c) + (N_c + 2)\log(N_s))$, and a total of $K$ quantum circuits are required to execute it in parallel.
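To give a feel for the saving, the two depth expressions stated for the sequential and parallel circuits can be evaluated directly. The sketch below is only indicative: it assumes base-2 logarithms and unit constants (the stated complexities are orders of growth), and the value $N_s = 1024$ is a hypothetical sample count chosen here, while $N_c = 64$ and $K = 16$ follow the 1D PDE setting of the reproducibility statement.

```python
import math

def sequential_depth(n_c, n_s, k):
    """(K + 1) log(N_c) + (N_c + 2) log(N_s), with log base 2 assumed."""
    return (k + 1) * math.log2(n_c) + (n_c + 2) * math.log2(n_s)

def parallel_depth(n_c, n_s):
    """2 log(N_c) + (N_c + 2) log(N_s), with log base 2 assumed."""
    return 2 * math.log2(n_c) + (n_c + 2) * math.log2(n_s)

n_c, n_s, k = 64, 1024, 16            # n_s is a hypothetical value
seq = sequential_depth(n_c, n_s, k)
par = parallel_depth(n_c, n_s)
saving = seq - par                     # equals (K - 1) * log2(N_c)
print(f"sequential: {seq:.0f}, parallel: {par:.0f}, saving: {saving:.0f}")
```

Under these assumptions the dataloading and Fourier transform terms dominate both totals, and the gain from parallelization is confined to the learnable part, whose depth drops from $K\log(N_c)$ to $\log(N_c)$.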

3.2.3. COMPOUND QUANTUM CIRCUIT FOR FOURIER LAYER

As highlighted in the previous subsection, the depth of the parameterized part of the sequential circuit might make the learning process difficult on currently available noisy quantum hardware. To deal with this, we propose another variant of the sequential circuit where, instead of having the parameterized circuit controlled by the top $K$ qubits of the lower register, we span a single bigger butterfly circuit over the upper register ($N_c$ qubits) and the top $K$ qubits of the lower register. Note that the upper and lower registers are unary independently. Therefore, if we jointly consider the upper register and the top $K$ qubits of the lower register, there are states with Hamming weight two as well, corresponding to the case where the 1 of the lower register is in one of its top $K$ qubits. Thus, instead of a unary basis (which corresponds to Hamming weight one only), we have a Hamming weight one and a Hamming weight two basis. Let us imagine a butterfly circuit on $N_c + K$ qubits. The complete unitary is a $2^{N_c+K} \times 2^{N_c+K}$ block diagonal matrix with each block corresponding to a subspace of fixed Hamming weight Kerenidis & Prakash (2022), $B = B_1 \oplus B_2 \oplus \dots \oplus B_n$, where $B_i$ is the block of the unitary for the subspace with Hamming weight $i$. Since our input has Hamming weight 1 or 2, we only care about the unitaries $B_1$ and $B_2$. $B_1$ is of size $(N_c+K) \times (N_c+K)$ and $B_2$ of size $\binom{N_c+K}{2} \times \binom{N_c+K}{2}$. Given these unitaries, we take the input state and split it into $(N_c + K)$ Hamming weight 1 states and $\binom{N_c+K}{2}$ Hamming weight 2 states. Then we apply $B_1$ and $B_2$ to these states. Here $B_1$ is a butterfly matrix and $B_2$ is the corresponding compound matrix of order two Horn & Johnson (2012). Figure 2 shows the diagram for this circuit. Given that the circuit is the same as the sequential circuit up to the QFT operation, the state of this circuit is the same as the one in eq. 10.
We now separate this complete state into two sets of states corresponding to Hamming weight 1 and 2:

$$= \sum_i^{N_c} \sum_j^{K} a^f_{ij}\, |e_i\rangle |e_j\rangle + \sum_i^{N_c} \sum_{j=K+1}^{N_s} a^f_{ij}\, |e_i\rangle |e_j\rangle \tag{17}$$

where the first term corresponds to the Hamming weight 2 states $|h_2\rangle$ and the second term corresponds to the Hamming weight 1 states $|h_1\rangle$. Let us first focus on the term corresponding to $|h_1\rangle$. It does not contain the states where the qubits in the upper register are all 0 and the 1 lies in the top $K$ qubits of the lower register. This implies that the coefficients of all these states should be taken as zero. Therefore, the state corresponding to $|h_1\rangle$ can also be written as:

$$\sum_i^{N_c} \sum_{j=K+1}^{N_s} a^f_{ij}\, |e_i\rangle |e_j\rangle + \sum_i^{K} \sum_{j=K+1}^{N_s} 0\, |e_0\rangle |e_{ij}\rangle \tag{18}$$

where $|e_0\rangle$ denotes the state with no ones in the upper $N_c$ register and $|e_{ij}\rangle$ denotes the Hamming weight 2 state of the lower register, where $i$ and $j$ denote the positions of the ones. Similarly, if we consider the first term in eq. 17, corresponding to $|h_2\rangle$, we further have to include the states where both ones are in the upper register or both ones are in the top $K$ qubits of the lower register. These new states again have zero coefficients. As a result, we can write the term corresponding to $|h_2\rangle$ in eq. 17 as:

$$\sum_i^{N_c} \sum_{j>i}^{N_c} 0\, |e_{ij}\rangle |e_0\rangle + \sum_i^{N_c} \sum_j^{K} a^f_{ij}\, |e_i\rangle |e_j\rangle + \sum_i^{K} \sum_{j>i}^{K} 0\, |e_0\rangle |e_{ij}\rangle \tag{19}$$

This results in a total of $\binom{N_c+K}{2}$ states. Now, we apply the butterfly circuit, corresponding to the unitary $B_1$, to the $|h_1\rangle$ state in eq. 18. For notational consistency we denote this operation as a multiplication with the matrix $W^1 \in \mathbb{R}^{(N_c+K)\times(N_c+K)}$. It results in the transformed coefficients $b_{ij}$:

$$b^f_{ij} = \sum_t^{N_c} \left(W^1_{it}\, a^f_{tj}\right) + \sum_{t=N_c+1}^{N_c+K} \left(W^1_{it} \times 0\right), \quad i \in [1, N_c + K],\ j \in [K + 1, N_s] \tag{20}$$

Furthermore, we also apply a post-select operation to preserve the basis, selecting only the states that had a non-zero coefficient before applying $B_1$.
Now, for the Hamming weight 2 state, the second-order compound unitary $B_2$, denoted by the matrix $W^2 \in \mathbb{R}^{q \times q}$ where $q = \binom{N_c+K}{2}$, has each of its elements equal to the determinant of a $2 \times 2$ submatrix of $W^1$:

$$W^2_{i,j} = W^1_{a,b}\, W^1_{a+k,b+k} - W^1_{a+k,b}\, W^1_{a,b+k} \tag{21}$$

for some $a, b, k < N_c + K$. After applying this unitary $B_2$ to the $|h_2\rangle$ states, their coefficients $c_{ij}$ are:

$$c_{ij} = \sum_t^{\binom{N_c}{2}} W^2_{it} \times 0 + \sum_t^{N_c K} W^2_{i,\,t+\binom{N_c}{2}} \times a^f_{tj} + \sum_{t=N_c K+\binom{N_c}{2}}^{\binom{N_c+K}{2}} W^2_{it} \times 0 \tag{22}$$

As in the case of Hamming weight 1, here too we use a post-select operation to discard the states that initially had zero coefficients, thereby preserving the basis. Combining the transformed $|h_1\rangle$ and $|h_2\rangle$ states and applying the IQFT on the lower register, the final output state of this circuit is:

$$\sum_i^{N_c} |e_i\rangle\, \mathrm{IQFT}\left(\sum_j^{K} c_{ij}\, |e_j\rangle + \sum_{j=K+1}^{N_s} b_{ij}\, |e_j\rangle\right) \tag{23}$$

and the order of the depth complexity is $\log(N_c + K) + \log(N_c) + (N_c + 2)\log(N_s)$.
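The second-order compound matrix used for the Hamming weight 2 subspace can be computed classically for small sizes, which is a convenient way to check its properties. The sketch below forms all $2 \times 2$ minors of a matrix; it is an illustration under assumptions: the lexicographic ordering of index pairs and the name `second_compound` are choices made here, and a random orthogonal matrix stands in for the butterfly matrix $W^1$.

```python
import numpy as np
from itertools import combinations

def second_compound(w):
    """Matrix of all 2x2 minors of w. Rows and columns are indexed by
    pairs (a, b) with a < b, in lexicographic order (an assumed ordering)."""
    n = w.shape[0]
    pairs = list(combinations(range(n), 2))
    c = np.empty((len(pairs), len(pairs)))
    for r, (a, b) in enumerate(pairs):
        for s, (p, q) in enumerate(pairs):
            # determinant of the submatrix on rows (a, b) and columns (p, q)
            c[r, s] = w[a, p] * w[b, q] - w[a, q] * w[b, p]
    return c

rng = np.random.default_rng(2)
w1, _ = np.linalg.qr(rng.standard_normal((5, 5)))   # stand-in for W^1 with N_c + K = 5
w2 = second_compound(w1)                             # shape (10, 10) = C(5, 2) squared
```

By the Cauchy-Binet formula the compound of an orthogonal matrix is itself orthogonal, so `w2 @ w2.T` is the identity; this is consistent with $B_2$ being a valid unitary on the Hamming weight 2 subspace.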

4. EXPERIMENTS

In this section, we analyze our proposed quantum algorithms on solving PDEs and on image classification tasks. We compare them against the state of the art in both domains, i.e., classical Fourier networks (PDEs) and CNNs (image classification). All details related to architectures and hyperparameters are provided at the end of the paper. All the experiments shown in this section are simulated, i.e., the quantum operations have been simulated using classical matrices corresponding to the quantum unitaries. The currently available quantum hardware is limited to 8-10 qubits and is also too noisy for deeper circuits. Deploying these algorithms on quantum hardware will involve noise, which can be due to noisy quantum gates, measurement, the environment, etc.

[...] of relative error in estimating this mapping among the classical Fourier Layer, classical CNNs and the proposed quantum circuits for the Fourier Layer, across different resolutions. The quantum circuits perform comparably to the classical Fourier Layer and much better than classical CNNs.

Darcy's Flow Equation. This is a 2D PDE where the aim is to learn the mapping from the diffusion coefficient function to the solution function in the presence of a forcing function. All of them are functions of positional coordinates only. Figure 3 shows the relative error for the 2D version of all the methods in solving this PDE, across different resolutions. Here also the three quantum [...]

[...] Figure 3 (right) shows the performance comparison for this equation between our proposed circuits and classical methods. It shows the convergence comparison for this family with viscosity $\nu$ fixed to 1e-3 for all the methods. Here again, it can be observed that all the proposed circuits and the classical Fourier method perform significantly better than CNNs. Also, from Table 2, the sequential circuit performs similarly to the classical method and the others converge at a slightly higher error.

[...] 4 shows the results for this evaluation. It can be observed that our proposed algorithms perform decently in the image classification task as well.

5. CONCLUSION

We proposed a quantum algorithm to carry out the recently proposed classical Fourier Neural Operator on quantum hardware. We further proposed two more quantum algorithms, which perform a different operation from the classical algorithm and can be deployed much more efficiently on quantum hardware. The aim is to make the learning process more efficient when using noisy quantum hardware. Experimental results further verify that the proposed quantum circuits perform well when solving PDEs. The sequential circuit performs very similarly to the best-performing classical algorithm for PDEs and performs decently on image classification as well. An interesting future direction is to further modify the learning process of the compound circuit so that it outperforms the sequential circuit while remaining more efficient to deploy.

REPRODUCIBILITY STATEMENT

We have used the JAX framework to develop the methods discussed in the paper. All the experiments shown in the paper are simulated by defining the matrices using the unitaries corresponding to the quantum gates used in the paper. The architectural and hyperparameter details for both the classical and quantum Fourier Layers are the same as the ones used in Li et al. (2020). All the experiments involve a 4-layer architecture for both the classical Fourier baseline and the quantum Fourier circuits. The CNN architecture used in the Burgers and Darcy flow PDE experiments is a fully convolutional network based architecture proposed in Zhu & Zabaras (2018). For Navier-Stokes, we use a ResNet architecture for the CNN, comprising 18 residual blocks, as used in Li et al. (2020). For the CNNs in the image classification tasks, we use a four-layer architecture with 64 channels and a max-pooling layer after the second and fourth layers. For both tasks, each classical or quantum layer is immediately followed by a ReLU activation and a Batch Normalization operation. For the classification tasks, the output of each architecture is finally fed to a linear layer with as many projection heads as there are classes for that task. For the classical FNOs and the proposed quantum FNOs, except the compound quantum FNO, $K$ is fixed to 16 for the 1D PDE and image classification and to 12 for the 2D PDEs, and $N_c = 64$ for the 1D PDE and 32 for the 2D PDEs. For image classification, $N_c$ is set to 16. For the compound quantum FNO, because of its lower parametric requirements, $K$ is set to 16 and $N_c$ to 48 for all the tasks. The input for the PDEs, which initially has $N_c = 1$, is first projected to the required $N_c$ using a linear layer for all the classical and quantum Fourier methods, similar to Li et al. (2020). The final output of all the layers, with $N_c$ features, is projected back to $N_c = 1$ using another linear layer.
For all the PDEs, unless specified, 1000 training instances and 200 test instances are used, and training is carried out for 500 epochs using the Adam optimizer with the learning rate initialized to 0.001 and halved after every 100 epochs. The learning rate is fixed to 0.001 for the classification tasks. The number of epochs for the classification tasks can be seen in the plots.

[...] to $N$ unary quantum states. One of the proposed dataloaders is shown in Figure 6 for 8 qubits. It first flips the first qubit to the state $|1\rangle$. It then applies $(N-1)$ RBS gates with precomputed angles ($\alpha_i$) to load the vector. It can be observed that it is a log-depth circuit. For a classical input vector $a = (a_1, \dots, a_N)$, it results in the following quantum state:

$$|a\rangle = \frac{1}{\|a\|} \sum_i^{N} a_i\, |e_i\rangle \tag{25}$$

where $|e_i\rangle$ denotes the unary basis states for $N$ qubits. There are $N$ such states in total, and the coefficient of each of these states corresponds to an element of the normalized classical vector. To load the input matrix $b \in \mathbb{R}^{N_c \times N_s}$, we can superpose its row vectors $b_i$, where each of these vectors is first loaded using a controlled vector loader. To do so, we first load the norms $\|b_i\|$ of each of these rows using the norm loader shown in the circuit diagrams in the paper. It results in the following state on the upper register:

$$\frac{1}{\|b\|} \sum_i^{N_c} \|b_i\|\, |e_i\rangle \tag{26}$$

Now, for each of the unary states $|e_i\rangle$ in the upper register, we load the corresponding row vector $(b_{ij})_{j\in[1,N_s]}$ into the lower register using a controlled version of the unary vector loader (controlled operations labelled $\|X_i\|$ in the circuit diagrams), which involves applying the unary vector loader and its adjoint along with CNOT gates. For more details on this loader, refer to Cherrat et al. (2022).
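The angle precomputation for a log-depth unary loader can be sketched on the amplitude vector alone. The example below assumes a binary-tree arrangement of the $(N-1)$ RBS gates, a power-of-two length, and a rotation convention in which a gate with angle $\theta$ sends amplitude $r$ on qubit $p$ to $(r\cos\theta, r\sin\theta)$ on qubits $(p, q)$; these conventions and the names `loader_angles` and `load` are assumptions made here, so the angles may differ from the $\alpha_i$ of Figure 6 by signs or ordering.

```python
import numpy as np

def loader_angles(a):
    """Angles for a tree-shaped unary loader. Returns a list of layers,
    each a list of (p, q, theta) acting on amplitude indices p < q."""
    a = np.asarray(a, dtype=float)
    n = a.size
    assert n >= 2 and (n & (n - 1)) == 0, "length must be a power of two"
    layers, stride = [], n // 2
    while stride >= 1:
        layer = []
        for p in range(0, n, 2 * stride):
            left, right = a[p:p + stride], a[p + stride:p + 2 * stride]
            if stride == 1:        # leaves carry the signs of the entries
                theta = np.arctan2(right[0], left[0])
            else:                  # inner nodes split the norm between halves
                theta = np.arctan2(np.linalg.norm(right), np.linalg.norm(left))
            layer.append((p, p + stride, theta))
        layers.append(layer)
        stride //= 2
    return layers

def load(a):
    """Simulate the loader on unary amplitudes, starting from |e_1>."""
    v = np.zeros(len(a))
    v[0] = 1.0                     # first qubit flipped to |1>
    for layer in loader_angles(a):
        for p, q, theta in layer:  # gates within a layer act on disjoint qubits
            v[p], v[q] = np.cos(theta) * v[p], np.sin(theta) * v[p]
    return v

a = np.array([1.0, -2.0, 3.0, 4.0, 0.0, 5.0, -6.0, 7.0])
layers = loader_angles(a)
v = load(a)                        # equals a / ||a|| up to rounding
```

For $N = 8$ this uses $1 + 2 + 4 = 7 = N - 1$ gates in $\log_2 8 = 3$ layers, matching the gate count and logarithmic depth stated above.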
The final loaded state after $N_c$ of these controlled unary vector loaders is:

$$|b\rangle = \frac{1}{\|b\|} \sum_{i=1}^{N_c} \sum_{j=1}^{N_s} b_{ij} \, |e_i\rangle |e_j\rangle$$

For the parallelized circuit, the coefficients $(c_{ij})_k$ of the state after the IQFT are given explicitly by the equation for the Fourier transform:

$$(c_{ij})_k = \frac{1}{N_s} \left( \sum_{j \neq k} (a^f_{ij})_k \, e^{i 2\pi j / N_s} + (b^f_{ik})_k \, e^{i 2\pi k / N_s} \right) \quad (30)$$

Similarly, writing $c_{ij}$ for the sequential circuit discussed in the paper (using Eq. 13):

$$c_{ij} = \frac{1}{N_s} \left( \sum_{j=K+1}^{N_s} a^f_{ij} \, e^{i 2\pi j / N_s} + \sum_{j=1}^{K} b^f_{ij} \, e^{i 2\pi j / N_s} \right) \quad (31)$$

Comparing the above two equations shows that the coefficients in Eq. 31 are not a subset of the coefficients in Eq. 30, and there is no closed-form classical processing or transformation that recovers one from the other. Thus, the parallel circuit performs a somewhat different operation, which is intuitively similar to that of the sequential circuit. Given the experimental results, this operation is also effective in dealing with PDEs and images, while being more effective than the sequential circuit under noisy scenarios.

However, if we remove the IQFT operation from this circuit and instead apply the classical IFT, measuring after Eq. 15, we get the following $K$ matrices after applying the square root operation:

$$\left\{ \left[ (b^f_k)_k, \; (a^f_j)_{j \neq k} \right] \right\}_{k=1}^{K}$$

where $b^f_j$ and $a^f_j$ have been defined previously. There are $K$ such $N_c \times N_s$ matrices. If we combine $b^f_k$ from all of the $K$ matrices with $(a^f_j)_{j \in [K+1, N_s]}$ from any one of the $K$ matrices, say the first one, we obtain the following $N_c \times N_s$ matrix:

$$\left[ (b^f_j)_{j \in [1, K]}, \; (a^f_j)_{j \in [K+1, N_s]} \right]$$

which is exactly the same as Eq. 3. Thus, this circuit (without the IQFT), followed by some classical post-processing and the IFT, can replicate the classical Fourier layer operation.
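The classical post-processing step described above, which merges the $K$ measured matrices into a single $N_c \times N_s$ matrix before the IFT, can be sketched as follows. This is only an illustration under stated assumptions: the function name `assemble_fourier_modes` and the toy values are hypothetical, the matrices are taken to be already available as nested lists of real numbers, columns are zero-indexed (so mode $k$ in the text is column $k-1$ in the code), and the subsequent IFT is not shown.

```python
def assemble_fourier_modes(per_mode, K):
    """Combine K measured matrices into one N_c x N_s matrix.

    per_mode[k] is assumed to hold the transformed mode b^f_k in column k
    and the untouched modes a^f_j in every other column.
    """
    # Start from the first matrix: it supplies a^f_j for the columns beyond K.
    combined = [row[:] for row in per_mode[0]]
    for k in range(K):
        for c, row in enumerate(per_mode[k]):
            # Take b^f_k (column k) from the k-th matrix.
            combined[c][k] = row[k]
    return combined

# Toy data (hypothetical values): N_c = 2 channels, N_s = 4 modes, K = 2.
a_f = [[1.0, 2.0, 3.0, 4.0],
       [5.0, 6.0, 7.0, 8.0]]
b_f = [[10.0, 20.0],   # b_f[c][k]: transformed value of mode k in channel c
       [50.0, 60.0]]
K = 2

# Build the K per-mode matrices: matrix k differs from a_f only in column k.
per_mode = []
for k in range(K):
    m = [row[:] for row in a_f]
    for c in range(len(m)):
        m[c][k] = b_f[c][k]
    per_mode.append(m)

combined = assemble_fourier_modes(per_mode, K)
```

In this toy case the first $K$ columns of `combined` come from `b_f` and the remaining columns from `a_f`, matching the block structure of the combined matrix in the text.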

D WHY COMPOUND MATRIX?

The intuition for why an operation different from the classical one, in the form of this compound matrix, might work lies in its expressive power. The matrix for the sequential circuit can be interpreted as a special case of this compound matrix in which, instead of trainable connections, the upper and lower registers interact through a fixed controlled gate. Owing to this special connection, the transform converges efficiently. The compound matrix is thus a more expressive version of the linear transform, and it also has a lower depth. Its use was further motivated by the assumption that, if optimized carefully, it might lead either to a more parallelized version of the sequential circuit or to a completely different and more efficient learned matrix. It is also a more quantum-native operation, which would be quite complex to carry out classically, and can thus provide a better demonstration of quantum advantage.



(a) Left: Classical Fourier Layer (b) Right: Sequential Quantum Circuit for FNO.

Figure 1: Left: Intermediate linear transform operation performed in the classical FNO after applying the Fourier Transform, Li et al. (2020). As shown, it performs a linear transform on the features corresponding to the first $K$ modes of the Fourier-transformed input. Right: The proposed Sequential Quantum Circuit, which replicates the classical FNO operation. Further details are given in Section 3.2.1. The dashed box encloses the controlled butterfly, which comprises a controlled norm-loader and the parametric butterfly ($P_k$).

Figure 2: Left: Parallelized version of the Sequential Quantum Circuit, which minimizes the depth of the learning part and thus makes it more efficient when deployed on noisy hardware. For each mode (out of the top $K$) in the transformed input, there is a different circuit to perform the parameterized matrix transform. Right: Another variant of the sequential circuit where, instead of controlled butterfly circuits, there is a Compound Butterfly Circuit spanning the upper register and the top $K$ qubits of the lower register.

Figure 3: Left: Performance comparison (relative error as used in Li et al. (2020)) of the classical Fourier networks, CNNs, and the three circuit proposals for a quantum Fourier layer on the 1D Burgers' equation across different resolutions. The quantum Fourier circuits are quite close in performance to the classical Fourier baseline and much better than the classical CNNs at minimizing the error. Middle: The same comparison on the 2D Darcy flow PDE for different resolutions. A similar relative performance is observed: the error of the CNNs is much larger and increases with resolution, whereas the errors of the other four methods (the three quantum circuits and the classical layer) are quite similar. Right: Convergence comparison for the Navier-Stokes equation with $\nu = 10^{-3}$, trained for 500 epochs.

Figure 4: Left: Performance comparison of the CNNs, the classical Fourier layer, and the proposed quantum circuits on the MNIST dataset. All of them perform quite similarly, with the classical CNNs being the best. Middle: A similar comparison on the Pneumonia-MNIST Yang et al. (2021) dataset. The performance of the CNNs is somewhat noisy here, whereas that of the sequential circuit is smoother, and both converge to a similar value. The compound quantum circuit and the classical Fourier baseline are also quite close to the CNNs in convergence. Right: The same comparison on the FashionMNIST Xiao et al. (2017) data. Here, a significant difference in performance is observed, with the CNNs being the best, followed by the sequential circuit.

Figure 6: Quantum circuit for loading a classical vector into the unary basis, Johri et al. (2021). It comprises $(N-1)$ RBS gates, where the angle of each gate ($\alpha_i$) is pre-calculated from the input. See Johri et al. (2021) for further details.

state, after the IQFT, for the parallelized circuit in Eq. 16 in the paper: denote the coefficients of this state corresponding to the $k$-th circuit by $(c_{ij})_k$, thus rewriting it as: $\sum_{i,j} (c_{ij})_k \, |e_i\rangle_k |e_j\rangle_k$

Comparison of the parameters required by one layer of the proposed circuits and of the existing classical Fourier layer, along with an error analysis for different $\nu$ and $T$ values for the 2D Navier-Stokes equation.

Appendix

A QUANTUM BUTTERFLY CIRCUITS

As discussed in the paper, replacing each radix-2 butterfly-shaped operation in the butterfly diagram of the Fast Fourier Transform (FFT) with a parameterized 2-qubit gate, the Reconfigurable Beam Splitter (RBS) gate, results in a parameterized butterfly circuit (see Figure 5). The gate is parameterized by a single parameter $\theta$, and the corresponding unitary is:

$$RBS(\theta) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

It can be observed that it applies a transformation on the $|01\rangle$ and $|10\rangle$ basis states and is an identity operation on the remaining ones. These two states comprise the unary, or Hamming-weight-1, basis for two qubits. The gate therefore only mixes basis states of the same Hamming weight and thus preserves the Hamming weight. Consequently, if we load our input into a basis with a certain Hamming weight, then after any transformation made of these RBS gates, the basis states corresponding to non-zero coefficients in the output will have that same Hamming weight.
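The Hamming-weight-preserving property of the RBS gate can be checked numerically with a few lines of plain Python. The sketch below assumes the common convention in which the gate acts as a planar rotation by $\theta$ on the $\{|01\rangle, |10\rangle\}$ subspace, in the basis order $|00\rangle, |01\rangle, |10\rangle, |11\rangle$; the sign convention and the helper names (`rbs`, `preserves_hamming_weight`, `is_orthogonal`) are assumptions made for illustration.

```python
import math

def rbs(theta):
    """4x4 RBS matrix in the basis |00>, |01>, |10>, |11> (assumed sign convention)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0,   c,   s, 0.0],
            [0.0,  -s,   c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def hamming_weight(index):
    """Number of 1s in the binary label of a computational basis state."""
    return bin(index).count("1")

def preserves_hamming_weight(u, tol=1e-12):
    """True if u never maps a basis state onto one with a different Hamming weight."""
    n = len(u)
    return all(abs(u[r][c]) <= tol
               for r in range(n) for c in range(n)
               if hamming_weight(r) != hamming_weight(c))

def is_orthogonal(u, tol=1e-12):
    """True if the columns of u are orthonormal (u^T u = I)."""
    n = len(u)
    for r in range(n):
        for c in range(n):
            dot = sum(u[k][r] * u[k][c] for k in range(n))
            if abs(dot - (1.0 if r == c else 0.0)) > tol:
                return False
    return True

U = rbs(0.3)
weight_ok = preserves_hamming_weight(U)
orthogonal_ok = is_orthogonal(U)
```

Both checks hold for any value of $\theta$, since the only off-diagonal entries of the matrix connect $|01\rangle$ and $|10\rangle$, which share Hamming weight 1.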

B DATALOADERS

Since our input is an $N_c \times N_s$ matrix, to load it into a quantum state, the matrix can be interpreted as $N_c$ vectors, each of size $N_s$. The resulting state should be a superposition of these vectors. In the paper, we discussed a recent work, Johri et al. (2021), which used dataloaders to load an $N$-dimensional input into $N$ unary quantum states.
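A superposition of the $N_c$ row vectors can be illustrated classically by computing the amplitudes that such a state would carry. The sketch below assumes a two-register layout in which the upper register holds the normalized row norms and, conditioned on row $i$, the lower register holds the normalized row $b_i$; the function name `matrix_loader_amplitudes` and the example matrix are hypothetical, and no quantum circuit is simulated.

```python
import math

def matrix_loader_amplitudes(b):
    """Amplitudes on |e_i>|e_j> for a row-norm register followed by per-row loading."""
    row_norms = [math.sqrt(sum(x * x for x in row)) for row in b]
    total = math.sqrt(sum(r * r for r in row_norms))   # Frobenius norm ||b||
    upper = [r / total for r in row_norms]             # amplitudes ||b_i|| / ||b||
    state = []
    for amp_i, r, row in zip(upper, row_norms, b):
        # Conditioned on |e_i>, the lower register holds b_i / ||b_i||.
        lower = [x / r if r > 0 else 0.0 for x in row]
        state.append([amp_i * x for x in lower])
    return upper, state

b = [[1.0, -2.0, 2.0],
     [0.0, 3.0, 4.0]]
upper, state = matrix_loader_amplitudes(b)
```

Under these assumptions the product of the two factors collapses to $b_{ij} / \|b\|$, so the amplitudes form a normalized copy of the input matrix.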

