FUNDAMENTAL LIMITS IN FORMAL VERIFICATION OF MESSAGE-PASSING NEURAL NETWORKS

Abstract

Output reachability and adversarial robustness are among the most relevant safety properties of neural networks. We show that in the context of Message Passing Neural Networks (MPNN), a common Graph Neural Network (GNN) model, formal verification is impossible. In particular, we show that output reachability of graph-classifier MPNN, working over graphs of unbounded but finite size, non-trivial degree and sufficiently expressive node labels, cannot be verified formally: there is no algorithm that answers correctly (with yes or no), given a graph-classifier MPNN, whether there exists some valid input to the MPNN such that the corresponding output satisfies a given specification. However, we also show that output reachability and adversarial robustness of node-classifier MPNN can be verified formally when a limit on the degree of input graphs is given a priori. We discuss the implications of these results, for the purpose of obtaining a complete picture of the principal possibility to formally verify GNN, depending on the expressiveness of the involved GNN models and input-output specifications.

1. INTRODUCTION

The Graph Neural Network (GNN) framework, i.e. models that compute functions over graphs, has become a go-to technique for learning tasks over structured data. This is not surprising since the application possibilities of GNN are enormous, ranging from natural sciences (Kipf et al. (2018); Fout et al. (2017)) over recommender systems (Fan et al. (2019)) to general knowledge graph applications, which themselves include a broad range of applications (Zhou et al. (2020)). Naturally, the high interest in GNN and their broad range of applications, including safety-critical ones, for instance in traffic situations, impose two necessities: first, a solid foundational theory of GNN is needed that describes possibilities and limits of GNN models. Second, methods for assessing the safety of GNN are needed, in the best case giving guarantees for certain safety properties. Compared to the amount of work on performance improvement for GNN or the development of new model variants, the amount of work studying basic theoretical results about GNN is rather limited. Some general results have been obtained as follows: independently, Xu et al. (2019) and Morris et al. (2019) showed that GNN belonging to the model of Message Passing Neural Networks (MPNN) (Gilmer et al. (2017)) are non-universal in the sense that they cannot be trained to distinguish specific graph structures. Furthermore, both relate the expressiveness of MPNN to the Weisfeiler-Leman graph isomorphism test. This characterisation is thoroughly described and extended by Grohe (2021). Loukas (2020) showed that MPNN can be Turing universal under certain conditions and gave impossibility results of MPNN with restricted depth and width for solving certain graph problems. Similarly, there is a lack of work regarding safety guarantees for GNN, or in other words work on formal verification of GNN.
Research in this direction is almost exclusively concerned with certifying adversarial robustness properties (ARP) of node-classifying GNN (see Sect. 1.1 for details). There, usually considered ARP specify a set of valid inputs by giving a center graph and a bounded budget of allowed modifications and are satisfied by some GNN if all valid inputs are classified to the same, correct class. However, due to the nature of allowed modifications, these properties cover only local parts of the input space, namely neighbourhoods around a center graph. are pairwise disjoint, and whenever (v, v ′ ) ∈ D and v ∈ V i then v ′ ∈ V i+1 or vice-versa, and for each node v ∈ V i , i ≥ 1 there is exactly one v ′ ∈ V i-1 such that (v ′ , v) ∈ D. We call k the depth of graph B. A d-tree is a d-graph that is a tree. Neural networks. We only consider classical feed-forward neural networks using ReLU activations given by re(x) = max(0, x) across all layers and simply refer to these as neural networks (NN). We use relatively small NN as building blocks to describe the structure of more complex ones. We call these small NN gadgets and typically define a gadget by specifying its computed function. This way of defining a gadget is ambiguous as there could be several, even infinitely many NN computing the same function. An obvious candidate will usually be clear from context. Let N be a NN. We call N positive if for all inputs x we have N (x) ≥ 0. We call N upwards bounded if there is n with n ∈ R such that N (x) ≤ n for all inputs x. Message passing neural networks. A Message Passing Neural Network (MPNN) Gilmer et al. (2017) consists of layers l 1 , . . . , l k followed by a readout layer l read , which gives the overall output of the MPNN. 
Each regular layer $l_i$ computes $l_i(x, M) = \mathit{comb}_i(x, \mathit{agg}_i(M))$ where $M$ is a multiset (a set that may contain duplicates), $\mathit{agg}_i$ is an aggregation function, mapping a multiset of vectors onto a single vector, and $\mathit{comb}_i$ is a combination function, mapping two vectors of the same dimension to a single one. In combination, layers $l_1, \dots, l_k$ map each node $v \in V$ of a graph $G = (V, D, L)$ to a vector $x^k_v$ in the following recursive way: $x^0_v = L(v)$ and $x^i_v = l_i(x^{i-1}_v, M^{i-1}_v)$ where $M^{i-1}_v$ is the multiset of vectors $x^{i-1}_{v'}$ of all neighbours $v' \in \mathit{Neigh}(v)$. We distinguish two kinds of MPNN, based on the form of $l_{\mathit{read}}$: the readout layer $l_{\mathit{read}}$ of a node-classifier MPNN computes $l_{\mathit{read}}(v, M^k) = \mathit{read}(x^k_v)$ where $v$ is some designated node, $M^k$ is the multiset of all vectors $x^k_v$ and $\mathit{read}$ maps a single vector onto a single vector. The readout layer of a graph-classifier MPNN computes $l_{\mathit{read}}(M^k) = \mathit{read}(\sum_{v \in V} x^k_v)$. We denote the application of a node-classifier MPNN $N$ to $G$ and $v$ by $N(G, v)$ and the application of a graph-classifier $N$ to $G$ by $N(G)$. In this paper, we make common choices (cf. Gilmer et al. (2017); Barceló et al. (2020); Wu et al. (2021)) for the form of the aggregation, combination and readout parts: $\mathit{agg}_i(M) = \sum_{x \in M} x$, $\mathit{comb}_i(x, M) = N_i(x, \mathit{agg}_i(M))$ where $N_i$ is a NN, and $\mathit{read}(M) = N_r(\sum_{x \in M} x)$ respectively $\mathit{read}(x) = N_r(x)$ where, again, $N_r$ is a NN.

Input and output specifications. An input specification $\varphi$ over graphs (resp. pairs of graphs and nodes) is some formula, set of constraints, listing etc. that defines a set of graphs (resp. pairs of graphs and nodes) $S_\varphi$. If a graph $G$ (resp. pair $(G, v)$) is included in $S_\varphi$ we say that it is valid regarding $\varphi$ or that it satisfies $\varphi$, written $G \models \varphi$, resp. $(G, v) \models \varphi$. Analogously, an output specification $\psi$ over vectors defines a set of valid or satisfying vectors of equal dimensions. Typically, we denote a set of input specifications by $\Phi$ and a set of output specifications by $\Psi$.
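The layer and readout definitions above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions: the adjacency-list representation `adj`, the helper names (`nn`, `vec_sum`, `mpnn_states`, `node_classifier`, `graph_classifier`) and the toy weights are hypothetical and not taken from the paper; the sketch only follows the stated choices of sum aggregation and ReLU networks for combination and readout.

```python
def relu_vec(vec):
    return [max(0.0, a) for a in vec]

def affine(W, b, x):
    # One affine map W x + b, with W given as a list of rows.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def nn(layers, x):
    # Feed-forward NN with ReLU after every affine layer (assumed form).
    for W, b in layers:
        x = relu_vec(affine(W, b, x))
    return x

def vec_sum(vectors, dim):
    # Sum aggregation over a multiset of vectors.
    total = [0.0] * dim
    for vec in vectors:
        total = [t + a for t, a in zip(total, vec)]
    return total

def mpnn_states(adj, labels, comb_nns):
    # x^0_v = L(v); x^i_v = N_i(x^{i-1}_v, sum of the neighbours' x^{i-1}).
    x = {v: list(lab) for v, lab in labels.items()}
    for layers in comb_nns:
        dim = len(next(iter(x.values())))
        x = {v: nn(layers, x[v] + vec_sum([x[u] for u in adj[v]], dim))
             for v in adj}
    return x

def node_classifier(adj, labels, comb_nns, read_nn, v):
    return nn(read_nn, mpnn_states(adj, labels, comb_nns)[v])

def graph_classifier(adj, labels, comb_nns, read_nn):
    x = mpnn_states(adj, labels, comb_nns)
    dim = len(next(iter(x.values())))
    return nn(read_nn, vec_sum(x.values(), dim))

# Toy instance: path graph 0 - 1 - 2 with one-dimensional labels.
adj = {0: [1], 1: [0, 2], 2: [1]}
labels = {0: [1.0], 1: [2.0], 2: [3.0]}
comb_nns = [[([[1.0, 1.0]], [0.0])]]  # one layer: relu(x + sum of neighbours)
read_nn = [([[1.0]], [-10.0])]        # readout: relu(x - 10)
node_out = node_classifier(adj, labels, comb_nns, read_nn, 1)
graph_out = graph_classifier(adj, labels, comb_nns, read_nn)
```

In this toy instance the node states after the single layer are 3, 6 and 5, so the node-classifier readout at node 1 is driven to 0 while the graph-classifier readout sees the sum 14.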
Formal verification of adversarial robustness and output reachability properties. An adversarial robustness property (ARP) $P$ is a triple $P = (N, \varphi, \psi)$ where $N$ is a GNN, $\varphi$ some input specification and $\psi$ some output specification. We say that $P$ holds iff for all inputs $I \models \varphi$ we have $N(I) \models \psi$. We denote the set of all ARP with $\varphi \in \Phi$, $\psi \in \Psi$ and graph-classifier or node-classifier by $\mathrm{ARP}_{\mathrm{graph}}(\Phi, \Psi)$ respectively $\mathrm{ARP}_{\mathrm{node}}(\Phi, \Psi)$. We simply write $\mathrm{ARP}(\Phi, \Psi)$ when we make no distinction between graph- or node-classifiers. Analogously, an output reachability property (ORP) $Q$ is a triple $Q = (N, \varphi, \psi)$, which holds iff there is an input $I \models \varphi$ such that $N(I) \models \psi$, and we define $\mathrm{ORP}_{\mathrm{graph}}(\Phi, \Psi)$, $\mathrm{ORP}_{\mathrm{node}}(\Phi, \Psi)$ and $\mathrm{ORP}(\Phi, \Psi)$ accordingly. Let $\mathcal{P}$ be a set of safety properties like $\mathrm{ARP}_{\mathrm{graph}}(\Phi, \Psi)$ or $\mathrm{ORP}_{\mathrm{node}}(\Phi, \Psi)$. We say that $\mathcal{P}$ is formally verifiable if there is an algorithm $A$ satisfying two properties for all $P \in \mathcal{P}$: first, if $P$ holds then $A(P) = \top$ (completeness) and, second, if $A(P) = \top$ then $P$ holds (soundness).
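The two quantifier patterns in these definitions can be made concrete with a small sketch. It rests on a strong simplifying assumption that is not part of the paper: the candidate inputs form a finite, explicitly listed set, so both properties can be checked by enumeration. The function names and the toy scalar "network" are hypothetical; real ARP and ORP quantify over infinite sets of graphs, which is exactly what makes verification hard.

```python
def orp_holds(network, candidates, phi, psi):
    # ORP: SOME valid input is mapped to an output satisfying psi.
    return any(psi(network(i)) for i in candidates if phi(i))

def arp_holds(network, candidates, phi, psi):
    # ARP: ALL valid inputs are mapped to outputs satisfying psi.
    return all(psi(network(i)) for i in candidates if phi(i))

# Toy stand-in for a network: a scalar ReLU function.
def network(x):
    return max(0, 2 * x - 3)

candidates = range(-5, 6)

def phi(x):
    return 0 <= x <= 3          # input specification

def psi(y):
    return y <= 3               # output specification

def not_psi(y):
    return not psi(y)           # complemented output specification

arp_result = arp_holds(network, candidates, phi, psi)
orp_of_complement = orp_holds(network, candidates, phi, not_psi)
```

On this toy instance the ARP holds exactly because the ORP with the complemented output specification fails, which is the usual duality between "for all" and "there exists".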

3. OVERVIEW OF RESULTS

This work presents fundamental (im-)possibility results about formal verification of ARP and ORP of MPNN. Obviously, such results depend on the considered sets of specifications. All specification sets used in this work are described in detail in Appendix A. First, we establish a connection between ORP and ARP. For a set of output specifications $\Psi$ we define $\overline{\Psi} = \{\overline{\psi} \mid \psi \in \Psi\}$ where $\overline{\psi}$ defines exactly the set of vectors which do not satisfy $\psi$. We have that $\overline{\overline{\Psi}} = \Psi$.

Lemma 1. $\mathrm{ORP}(\Phi, \Psi)$ is formally verifiable if and only if $\mathrm{ARP}(\Phi, \overline{\Psi})$ is formally verifiable.

Proof. Note that the ARP $(N, \varphi, \psi)$ holds iff the ORP $(N, \varphi, \overline{\psi})$ does not hold. Hence, any algorithm for either of these can be transformed into an algorithm for the other problem by first complementing the output specification and flipping the yes/no answer in the end.

This connection between ARP and ORP, while usually not given formally, is folklore. For example, see the survey by Huang et al. (2020), describing the left-to-right direction of Lemma 1. Our first core contribution is that, in contrast to verification of ORP of classical NN, there are natural sets of graph-classifier ORP which cannot be verified formally. Let $\Phi_{\mathrm{unb}}$ be a set of graph specifications, allowing for unbounded but finite size, non-trivial degree and sufficiently expressive labels, and let $\Psi_{\mathrm{eq}}$ be a set of vector specifications able to check if a certain dimension of a vector is equal to some fixed integer.

Theorem 1 (Section 4). $\mathrm{ORP}_{\mathrm{graph}}(\Phi_{\mathrm{unb}}, \Psi_{\mathrm{eq}})$ is not formally verifiable.

Let $\Psi_{\mathrm{leq}}$ be a set of vector specifications, satisfied by vectors where for each dimension there is another dimension which is greater or equal. Now, $\Psi_{\mathrm{class}} := \overline{\Psi}_{\mathrm{leq}}$ is a set of vector specifications, defining vectors where a certain dimension is greater than all others or, in other words, outputs which can be interpreted as an exact class assignment.
We can easily alter the proof of Theorem 1 to argue that $\mathrm{ORP}_{\mathrm{graph}}(\Phi_{\mathrm{unb}}, \Psi_{\mathrm{leq}})$ is also not formally verifiable (see Section 4). Then, Lemma 1 implies the following result for ARP of graph-classifier MPNN.

Corollary 1. $\mathrm{ARP}_{\mathrm{graph}}(\Phi_{\mathrm{unb}}, \Psi_{\mathrm{class}})$ is not formally verifiable.

Thus, as soon as we consider ORP or ARP of graph-classifier MPNN over parts of the input space, including graphs of unbounded but finite size, with sufficient degree and expressive labels, it is no longer guaranteed that they are formally verifiable. To better understand the impact of our second core contribution, we make a short note on classical NN verification. There, a common choice of specifications over vectors are conjunctions of linear inequalities $\sum_{i \in I} c_i x_i \leq b$ where $c_i, b$ are rational constants and $x_i$ are dimensions of a vector. Such specifications define convex sets and, thus, we call the set of all such specifications $\Psi_{\mathrm{conv}}$. Let $\Phi_{\mathrm{bound}}$ be a set of graph-node specifications, bounding the degree of valid graphs and using constraints on labels in a bounded distance to the center node which can be expressed by vector specifications as described above. Now, it turns out that as soon as we bound the degree of input graphs, ORP of node-classifier MPNN with label constraints and output specifications from $\Psi_{\mathrm{conv}}$ can be verified formally.

Theorem 2 (Section 5). $\mathrm{ORP}_{\mathrm{node}}(\Phi_{\mathrm{bound}}, \Psi_{\mathrm{conv}})$ is formally verifiable.

Again, Lemma 1 implies a similar result for ARP of node-classifier MPNN. Obviously, we have $\Psi_{\mathrm{class}} \subseteq \overline{\Psi}_{\mathrm{conv}}$.

Corollary 2. $\mathrm{ARP}_{\mathrm{node}}(\Phi_{\mathrm{bound}}, \Psi_{\mathrm{class}})$ is formally verifiable.

This byproduct of Theorem 2 considerably extends the set of input specifications for which ARP of node-classifier MPNN is known to be formally verifiable.
In particular, the literature (see Section 1.1) gives indirect evidence that $\mathrm{ARP}_{\mathrm{node}}(\Phi_{\mathrm{neigh}}, \Psi_{\mathrm{class}})$ can be verified formally, where $\Phi_{\mathrm{neigh}}$ is a set of specifications defined by a center graph and a bounded budget of allowed structural modifications as well as label alterations restricted using box constraints, which can be expressed using vector specifications of the form given above. Thus, $\Phi_{\mathrm{neigh}} \subseteq \Phi_{\mathrm{bound}}$. The results above, in addition to some immediate implications, reveal major parts of the landscape of MPNN formal verification, depicted in Figure 1.

Figure 1: Overview of core results. The axes show classes of input, resp. output specifications, loosely ordered by expressiveness.

The three most important impressions to take from this visualisation are: first, the smaller the classes of specifications, the stronger an impossibility result becomes. Note that Theorem 1 and Corollary 1 naturally extend to more expressive classes of specifications (indicated by the red, squiggly arrows up and to the right). Second, results about the possibility to do formal verification grow in strength with the expressive power of the involved specification formalisms; Theorem 2 and Corollary 2 extend naturally to smaller classes (indicated by the green, squiggly arrows down and to the left). Third, the results presented here are not ultimately tight in the sense that there is a part of the landscape, between $\Phi_{\mathrm{bound}}$ and $\Phi_{\mathrm{unb}}$, for which the status of decidability of formal verification remains unknown.

Remark. An interesting observation is that formal verification of ORP of node-classifier MPNN is impossible as soon as we allow for input specifications that can express properties like $\exists v \forall v'\, E(v, v')$, stating that a valid graph must contain a "master node" that is connected to all other nodes. Then the same reduction idea as seen in Section 4 can be used to show that formal verification is no longer possible.

4. THE IMPOSSIBILITY OF FORMALLY VERIFYING ORP AND ARP OF GRAPH-CLASSIFIER MPNN OVER UNBOUNDED GRAPH CLASSES

The ultimate goal of this section is to show that $\mathrm{ORP}_{\mathrm{graph}}(\Phi_{\mathrm{unb}}, \Psi_{\mathrm{eq}})$ is not formally verifiable. Note that we use a weak form of $\Phi_{\mathrm{unb}}$ here; see Appendix A for details. To do so, we relate the formal verification of $\mathrm{ORP}_{\mathrm{graph}}(\Phi_{\mathrm{unb}}, \Psi_{\mathrm{eq}})$ to the following decision problem: given a graph-classifier $N$ with a single output dimension, the question is whether there is a graph $G$ such that $N(G) = 0$. We call this problem the graph-classifier problem (GCP).

Lemma 2. If GCP is undecidable then $\mathrm{ORP}_{\mathrm{graph}}(\Phi_{\mathrm{unb}}, \Psi_{\mathrm{eq}})$ is not formally verifiable.

Proof. By contraposition. Suppose $\mathrm{ORP}_{\mathrm{graph}}(\Phi_{\mathrm{unb}}, \Psi_{\mathrm{eq}})$ was formally verifiable. Then there is an algorithm $A$ such that for each $(N, \mathit{true}, y = 0) \in \mathrm{ORP}_{\mathrm{graph}}(\Phi_{\mathrm{unb}}, \Psi_{\mathrm{eq}})$ we have: $A$ returns $\top$ if and only if $(N, \mathit{true}, y = 0)$ holds. But then $A$ can be used to decide GCP.

Using this lemma, in order to prove Theorem 1, it suffices to show that GCP is undecidable, which we will do in the remaining part of this section. The proof works as follows: first, we define a satisfiability problem for a logic of graphs labeled with vectors, which we call Graph Linear Programming (GLP) as it could be seen as an extension of ordinary linear programming on graph structures. We then prove that GLP is undecidable by a reduction from Post's Correspondence Problem (PCP) (Post (1946)). From the form of the reduction we infer that the graph linear programs in its image are of a particular shape which can be used to define a fragment (therefore also undecidable), called Discrete Graph Linear Programming (DGLP). We then show how this fragment can be reduced to GCP, thus establishing its undecidability in a way that separates the structural from the arithmetical parts in a reduction from PCP to GCP. As a side-effect, with GLP we obtain a relatively natural undecidable problem on graphs and linear real arithmetic which may possibly serve to show further undecidability results on similar graph neural network verification problems.
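For orientation, a PCP instance is a finite list of pairs of words, and a solution is a non-empty index sequence whose concatenated first components equal its concatenated second components. Checking a given candidate is easy; it is the existence question that is undecidable. The following sketch illustrates this on a small solvable instance. The function names are hypothetical, and the bounded search is only a brute-force illustration that stops at a length limit, not a decision procedure.

```python
from itertools import product

def is_pcp_solution(tiles, indices):
    # tiles: list of (alpha, beta) word pairs; indices are 1-based.
    if not indices:
        return False
    top = "".join(tiles[i - 1][0] for i in indices)
    bottom = "".join(tiles[i - 1][1] for i in indices)
    return top == bottom

def bounded_pcp_search(tiles, max_len):
    # Tries all index sequences up to length max_len. Returning None only
    # means "no solution of length <= max_len", never "unsolvable".
    for length in range(1, max_len + 1):
        for indices in product(range(1, len(tiles) + 1), repeat=length):
            if is_pcp_solution(tiles, indices):
                return list(indices)
    return None

P0 = [("aab", "aa"), ("b", "abb"), ("ba", "bb")]
check = is_pcp_solution(P0, [1, 3, 1, 2])
found = bounded_pcp_search(P0, 4)
```

The length bound is what keeps the search terminating; dropping it yields a procedure that halts on solvable instances only, which is consistent with PCP being undecidable but semi-decidable.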

4.1. FROM PCP TO GLP

We begin by defining the Graph Linear Programming problem GLP. Let $X = \{x_1, \dots, x_n\}$ be a set of variables. A node condition $\varphi$ is a formula given by the syntax

$$\varphi ::= \sum_{i=1}^{n} a_i x_i + b_i(\odot x_i) \leq c \;\mid\; \varphi \wedge \varphi \;\mid\; \varphi \vee \varphi$$

where $a_i, b_i, c \in \mathbb{Q}$. Intuitively, the $x_i$ are variables for a vector of $n$ real values, constituting a graph's node label, and the operator $\odot$ describes access to the node's neighbourhood, resp. their labels. We write $\mathit{sub}(\varphi)$ for the set of subformulas of $\varphi$ and $\mathit{Var}(\varphi)$ for the set of variables occurring inside $\varphi$. We use the abbreviation $t = c$ for $t \leq c \wedge -t \leq -c$. Let $G = (V, D, L)$ be a graph with $L : V \to \mathbb{R}^n$. A node condition $\varphi$ induces a set of nodes of $G$, written $[[\varphi]]_G$, which is defined inductively as follows.

$$v \in [[\textstyle\sum_{i=1}^{n} a_i x_i + b_i(\odot x_i) \leq c]]_G \quad\text{iff}\quad \sum_{i=1}^{n} a_i L(v)_i + b_i \Big(\sum_{v' \in N_v} L(v')_i\Big) \leq c$$
$$v \in [[\varphi_1 \wedge \varphi_2]]_G \quad\text{iff}\quad v \in [[\varphi_1]]_G \cap [[\varphi_2]]_G$$
$$v \in [[\varphi_1 \vee \varphi_2]]_G \quad\text{iff}\quad v \in [[\varphi_1]]_G \cup [[\varphi_2]]_G$$

If $v \in [[\varphi]]_G$ then we say that $v$ satisfies $\varphi$. A graph condition $\psi$ is a formula given by the syntax $\psi ::= \sum_{i=1}^{n} a_i x_i \leq c \mid \psi \wedge \psi$, where $a_i, c \in \mathbb{Q}$. The semantics of $\psi$, written $[[\psi]]$, is the subclass of graphs $G = (V, D, L)$ with $L : V \to \mathbb{R}^n$ such that

$$G \in [[\textstyle\sum_{i=1}^{n} a_i x_i \leq c]] \quad\text{iff}\quad \sum_{i=1}^{n} a_i \Big(\sum_{v \in V} L(v)_i\Big) \leq c, \qquad G \in [[\psi_1 \wedge \psi_2]] \quad\text{iff}\quad G \in [[\psi_1]] \cap [[\psi_2]].$$

Again, if $G \in [[\psi]]$ then we say that $G$ satisfies $\psi$. The problem GLP is defined as follows: given a graph condition $\psi$ and a node condition $\varphi$ over the same set of variables $X = \{x_1, \dots, x_n\}$, decide whether there is a graph $G = (V, D, L)$ with $L : V \to \mathbb{R}^n$ such that $G$ satisfies $\psi$ and all nodes in $G$ satisfy $\varphi$. Such an $\mathcal{L} = (\psi, \varphi)$ is called a graph linear program, which we also abbreviate as GLP. It will be clear from the context whether GLP denotes a particular program or the entire decision problem.

As stated above, we show that GLP is undecidable via a reduction from Post's Correspondence Problem (PCP): given $P = \{(\alpha_1, \beta_1), (\alpha_2, \beta_2), \dots, (\alpha_k, \beta_k)\} \subseteq \Sigma^* \times \Sigma^*$ for some alphabet $\Sigma$, decide whether there is a non-empty sequence of indices $i_1, i_2, \dots, i_l$ from $\{1, \dots, k\}$ such that $\alpha_{i_1} \alpha_{i_2} \cdots \alpha_{i_l} = \beta_{i_1} \beta_{i_2} \cdots \beta_{i_l}$. The $\alpha_i, \beta_i$ are also called tiles. PCP is known to be undecidable when $|\Sigma| \geq 2$, i.e. we can always assume $\Sigma = \{a, b\}$. For example, consider the solvable instance $P_0 = \{(aab, aa), (b, abb), (ba, bb)\}$. It is not hard to see that $I = 1, 3, 1, 2$ is a solution for $P_0$. Furthermore, the corresponding sequence of tiles can be visualised as shown in Figure 2. The upper word is produced by the $\alpha_i$ parts of the tiles and the lower one by the $\beta_i$. The end of one and beginning of the next tile are visualised by the vertical part of the step lines.

Theorem 3. GLP is undecidable.

Proof sketch. We sketch the proof here and give a full version in Appendix B.1. The proof is done by establishing a reduction from PCP. The overall idea is to translate each PCP instance $P$ to a GLP $\mathcal{L}_P$ with the property that $P$ is solvable if and only if $\mathcal{L}_P$ is satisfiable. Thus, the translation must be such that $\mathcal{L}_P$ is only satisfiable by graphs that encode a valid solution of $P$. The encoding is depicted for the solution of $P_0$ shown in Figure 2 in form of solid lines and nodes in Figure 3. The word $w_\alpha = \alpha_1 \alpha_3 \alpha_1 \alpha_2$ is represented by the chain of yellow nodes from left to right in such a way that there is a node for each symbol $w_i$ of $w_\alpha$. If $w_i = a$ then $x_a = 1$ and $x_b = 0$ at the corresponding node, and vice-versa if $w_i = b$. Analogously, $\beta_1 \beta_3 \beta_1 \beta_2$ is represented by the blue chain. The borders between two tiles are represented as edges between the yellow and blue nodes corresponding to the starting positions of a tile. The encoding as a graph uses additional auxiliary nodes, edges and label dimensions, in order to ensure that the labels along the yellow and blue nodes indeed constitute a valid PCP solution, i.e. the sequences of their letter labels are the same, and they are built from corresponding tiles in the same order. In Figure 3, these auxiliary nodes and edges are indicated by the dashed parts.

GLP seems to be too expressive in all generality for a reduction to GCP; at least it does not seem (easily) possible to mimic arbitrary disjunctions in an MPNN. However, the node conditions $\varphi$ resulting from the reduction from PCP to GLP are always of a very specific form: $\varphi = \varphi' \wedge \varphi_{\mathrm{discr}}$ where $\varphi_{\mathrm{discr}} = \bigwedge_{i \in I} \bigvee_{m \in M_i} x_i = m$ with $M_i \subset \mathbb{N}$ enforces dimensions $x_i$ to be discrete and $\varphi'$ has the following property. Let $X$ be the set of dimensions not discretized by $\varphi_{\mathrm{discr}}$. For each $\varphi_1 \vee \varphi_2 \in \mathit{sub}(\varphi')$ it is the case that $\mathit{Var}(\varphi_1) \cap X = \emptyset$ or $\mathit{Var}(\varphi_2) \cap X = \emptyset$.

$\mathcal{L} \in \mathrm{GLP}$. Let $m, n \in \mathbb{R}$ with $m \leq n$ and $M = \{i_1, i_2, \dots, i_k\} \subseteq \mathbb{N}$ such that $i_j \leq i_{j+1}$ for all $j \in \{1, \dots, k-1\}$. We use the auxiliary gadget

$$\langle x \in [m; n] \rangle := re\big(re(x - n) - re(x - (n + 1)) + re(m - x) + re((m - 1) - x)\big)$$

to define the gadgets

$$\langle x \leq m \rangle := re\big(re(x - m) - re(x - (m + 1))\big)$$

and

$$\langle x \in M \rangle := re\Big(\langle x \in [i_1; i_k] \rangle + \sum_{j=1}^{k-1} re\Big(\tfrac{i_{j+1} - i_j}{2} - \big(re\big(x - \tfrac{i_j + i_{j+1}}{2}\big) + re\big(\tfrac{i_j + i_{j+1}}{2} - x\big)\big)\Big)\Big).$$

Each of the gadgets above fulfils specific properties which can be inferred from their functional forms without much effort: let $r \in \mathbb{R}$. Then, $\langle r \leq m \rangle = 0$ if and only if $r \leq m$, and $\langle r \in M \rangle = 0$ if and only if $r \in M$. Furthermore, both gadgets are positive and $\langle x \leq m \rangle$ is upwards bounded for all $m$ by 1 with the property that $r - m \geq 1$ implies $\langle r \leq m \rangle = 1$. We give a formal proof in Appendix B.2. We use $\langle x = m \rangle$ as an abbreviation for $\langle -x \leq -m \rangle + \langle x \leq m \rangle$. The input size of $N_{\mathcal{L}}$ equals the number of variables occurring in $\varphi$ and $\psi$. $N_{\mathcal{L}}$ has one layer with two output dimensions $y^1_{\mathrm{discr}}$ and $y^1_{\mathrm{cond}}$ and the readout layer has a single output dimension $y^r$. The subformula $\varphi_{\mathrm{discr}} = \bigwedge_{i \in I} \bigvee_{m \in M_i} x_i = m$ is represented by $y^1_{\mathrm{discr}} = \sum_{i \in I} \langle x_i \in M_i \rangle$ and then checked using $\langle y^1_{\mathrm{discr}} = 0 \rangle$ in the readout layer.
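The stated gadget properties can be sanity-checked numerically. The sketch below transcribes the three gadget definitions and the abbreviation for equality into plain Python; the function names (`in_interval`, `leq`, `in_set`, `eq`) and the sample values are hypothetical and chosen so that all arithmetic is exact in floating point. It is a numerical illustration, not a substitute for the formal proof in Appendix B.2.

```python
def re(x):
    # ReLU
    return max(0.0, x)

def in_interval(x, m, n):
    # <x in [m; n]>: zero exactly when m <= x <= n.
    return re(re(x - n) - re(x - (n + 1)) + re(m - x) + re((m - 1) - x))

def leq(x, m):
    # <x <= m>: zero iff x <= m, and capped at 1.
    return re(re(x - m) - re(x - (m + 1)))

def in_set(x, members):
    # <x in M>: zero iff x is one of the listed values.
    pts = sorted(members)
    gaps = sum(
        re((b - a) / 2 - (re(x - (a + b) / 2) + re((a + b) / 2 - x)))
        for a, b in zip(pts, pts[1:])
    )
    return re(in_interval(x, pts[0], pts[-1]) + gaps)

def eq(x, m):
    # <x = m> as the sum <-x <= -m> + <x <= m>.
    return leq(-x, -m) + leq(x, m)

samples = {
    "leq_satisfied": leq(2.0, 3.0),
    "leq_half": leq(3.5, 3.0),
    "leq_far": leq(10.0, 3.0),
    "set_member": in_set(2.0, {0, 2, 5}),
    "set_between": in_set(1.0, {0, 2, 5}),
    "eq_hit": eq(3.0, 3.0),
    "eq_miss": eq(4.0, 3.0),
}
```

Each gadget returns 0 on satisfying inputs and a strictly positive value otherwise, which is what allows a conjunction to be encoded as a sum that is later compared with 0.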
The remaining part of $\varphi$ is represented in output dimension $y^1_{\mathrm{cond}}$ in the following way. An atomic $\leq$-formula is represented using a $\langle x \leq m \rangle$ gadget. A conjunction $\varphi_1 \wedge \varphi_2$ is represented by a sum of two gadgets $f_1 + f_2$ where $f_i$ represents $\varphi_i$. For this to work, we need the properties that all used gadgets are positive and that their output is 0 when satisfied. To represent a disjunction $\varphi_1 \vee \varphi_2$ where $f_1$ and $f_2$ are the gadgets representing $\varphi_1$ resp. $\varphi_2$, we need the fact that $\mathcal{L}$ is a DGLP. W.l.o.g. suppose that $\varphi_1$ only contains discrete variables and that $\varphi_{\mathrm{discr}}$ is satisfied. Then we get: if $\varphi_1$ is not satisfied then the output of $f_1$ must be greater than or equal to 1. The reason for this is the following. If the property of some $\langle x \leq m \rangle$ gadget is not satisfied, its output must be 1, still under the assumption that its input includes discrete variables only. Furthermore, as $\langle x \leq m \rangle$ is positive and upwards bounded, the value of $f_2$ must be bounded by some value $k \in \mathbb{R}_{>0}$. Therefore, we can represent the disjunction using $re(f_2 - k \cdot re(1 - f_1))$. Note that this advanced gadget is also positive and upwards bounded. Again, the value of $y^1_{\mathrm{cond}}$ is checked in the readout layer using $\langle y^1_{\mathrm{cond}} = 0 \rangle$. The graph condition $\psi$ is represented using a sum of $\langle x \leq m \rangle$ gadgets. Thus, we can effectively translate a DGLP $\mathcal{L}$ into an MPNN $N_{\mathcal{L}}$ such that there is a graph $G$ with $N_{\mathcal{L}}(G) = 0$ if and only if $G$ satisfies $\mathcal{L}$, i.e. $\mathcal{L} \in \mathrm{DGLP}$. This transfers the undecidability from DGLP to GCP.

Proof of Theorem 1. The statement is an immediate consequence of the results of Theorem 4 and Lemma 2.

The proof for Corollary 1 follows the exact same line of arguments, but we consider the following decision problem: given a graph-classifier $N$ with two output dimensions, the question is whether there is a graph $G$ such that $(N(G)_1 \leq N(G)_2) \wedge (N(G)_2 \leq N(G)_1)$. We call this $\mathrm{GCP}_{\leq}$. Obviously, the statement of Lemma 2 also holds for $\mathrm{GCP}_{\leq}$ and $\mathrm{ORP}_{\mathrm{graph}}(\Phi_{\mathrm{unb}}, \Psi_{\mathrm{leq}})$.
Proving that GCP ≤ is undecidable is also done via reduction from DGLP with only minimal modifications of MPNN N L constructed in the proof of Theorem 4: we add a second output dimension to N L which constantly outputs 0. The correctness of the reduction follows immediately.
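The encoding of conjunction as a sum and of disjunction as $re(f_2 - k \cdot re(1 - f_1))$ used in the translation from DGLP to MPNN can be illustrated on a toy condition. The sketch below is an illustration under assumptions: the names `conj` and `disj`, the toy condition $(x_d \leq 1) \vee (x_c \leq 0.5)$ and the bound $k = 1$ are hypothetical choices, with $x_d$ playing the role of a discretized variable.

```python
def re(x):
    # ReLU
    return max(0.0, x)

def leq(x, m):
    # <x <= m> gadget: zero iff x <= m, capped at 1.
    return re(re(x - m) - re(x - (m + 1)))

def conj(f1, f2):
    # Conjunction: a sum of positive gadgets is 0 iff both are 0.
    return f1 + f2

def disj(f1, f2, k):
    # Disjunction: assumes f1 is 0 or >= 1 (discrete side) and 0 <= f2 <= k.
    return re(f2 - k * re(1 - f1))

def toy_condition(x_d, x_c):
    # Encodes (x_d <= 1) OR (x_c <= 0.5) with bound k = 1 on the second gadget.
    return disj(leq(x_d, 1.0), leq(x_c, 0.5), 1.0)

left_true = toy_condition(1.0, 7.0)     # first disjunct holds
right_true = toy_condition(2.0, 0.25)   # second disjunct holds
both_false = toy_condition(2.0, 7.0)    # neither holds
# If x_d is NOT discrete, f1 can land strictly between 0 and 1 and the
# encoding may report 0 although neither disjunct holds.
non_discrete = toy_condition(1.5, 0.75)
```

The last case returns 0 even though both disjuncts are violated, which illustrates why the construction relies on one side of every disjunction containing discretized variables only.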

5. THE POSSIBILITY OF FORMALLY VERIFYING ORP AND ARP OF NODE-CLASSIFIER MPNN OVER DEGREE BOUNDED GRAPH CLASSES

In order to prove Theorem 2, we argue that there is a naive algorithm verifying $\mathrm{ORP}_{\mathrm{node}}(\Phi_{\mathrm{bound}}, \Psi_{\mathrm{conv}})$ formally. Consider a node-classifier $N$ with $k$ layers and consider some graph $G$ with specified node $v$ such that $N(G, v) = y$. The crucial insight is that there is a tree $B$ of finite depth $k$ and with root $v_0$ such that $N(B, v_0) = y$. The intuitive reason for this is that $N$ can update the label of node $v$ using only information from nodes at distance at most $k$ from $v$. For example, assume that $k = 2$ and $G, v$ are given as shown on the left side of Figure 4, where the information of a node, given by its label, is depicted using different colours y (yellow), b (blue), r (red), g (green) and p (pink). As $N$ only includes two layers, information from the unfilled (white) nodes is not relevant for the computation of $N(G, v)$ as their distance to $v$ is greater than 2. Take the tree $B$ on the right side of Figure 4. We get that $N(G, v) = N(B, v_0)$.

Proof of Theorem 2. First, we observe the tree-model property for node-classifier MPNN over graphs of bounded degree: let $(N, \varphi, \psi) \in \mathrm{ORP}_{\mathrm{node}}(\Phi_{\mathrm{bound}}, \Psi_{\mathrm{conv}})$ where $N$ has $k'$ layers and $\varphi$ bounds valid graphs to degree $d \in \mathbb{N}$ and constrains nodes in the $k''$-neighbourhood of the center node. We have that $(N, \varphi, \psi)$ holds if and only if there is a $d$-tree $B$ of depth $k = \max(k', k'')$ with root $v_0$ such that $(B, v_0) \models \varphi$ and $N(B, v_0) \models \psi$. We prove this property in Appendix B.3. We fix the ORP $(N, \varphi, \psi)$ as specified above and assume that $\mathit{comb}_i$ as well as the readout function $\mathit{read}$ of $N$ are given by the NN $N_1, \dots, N_{k'}, N_r$ where $N_1$ has input dimension $2 \cdot m$ and output dimension $n$. Furthermore, assume that $\varphi$ bounds valid graphs to degree $d \in \mathbb{N}$. For each unlabeled tree $B = (V, D, v_0)$ with $V = \{v_0, \dots, v_l\}$ of degree at most $d$ and depth $k$, of which there are only finitely many, the verification algorithm $A$ works as follows.
By definition, the MPNN $N$ applied to $B$ computes $N_1(x_v, \sum_{v' \in N_v} x_{v'})$ as the new label for each $v \in V$ after layer $l_1$. However, as the structure of $B$ is fixed at this point, we know the neighbourhood of each node $v$. Therefore, $A$ constructs an NN $N'_1$ with input dimension $l \cdot m$ and output dimension $l \cdot n_1$ given by $\big(N_1(\mathit{id}(x_{v_0}), \mathit{id}(\sum_{v' \in N_{v_0}} x_{v'})), \dots, N_1(\mathit{id}(x_{v_l}), \mathit{id}(\sum_{v' \in N_{v_l}} x_{v'}))\big)$, representing the whole computation of layer $l_1$, where $\mathit{id}(x) := re(re(x) - re(-x))$ is a simple gadget computing the identity. In the same way, $A$ transforms the computation of layer $l_i$, $i \geq 2$, into a network $N'_i$ using the output of $N'_{i-1}$ as inputs. Then, $A$ combines the network $N'_{k'}$ of the last layer with $N_r$, by connecting the output dimensions of $N'_{k'}$ corresponding to node $v_0$ to the input dimensions of $N_r$, creating an NN $N'$ representing the computation of $N$ over graphs of structure $B$ for arbitrary labeling functions $L$. This construction reduces the question of whether $(N, \varphi, \psi)$ holds to the following question: are there labels for $v_0, \dots, v_l$, the input of $N'$, satisfying the constraints given by $\varphi$ such that the output of $N'$ satisfies $\psi$? As the label constraints of $\varphi$ and the specification $\psi$ are defined by conjunctions of linear inequalities, this is in fact an instance of the output reachability problem for NN, which is known to be decidable, as shown by Katz et al. (2017) or Sälzer & Lange (2021). Therefore, $A$ incorporates a verification procedure for ORP of NN and returns $\top$ if the instance is positive and otherwise considers the next unlabeled $d$-tree of depth $k$. If none has been found then it returns $\bot$. The soundness and completeness of $A$ follow from the tree-model property, the exhaustive loop over all candidate trees and the use of the verification procedure for output reachability of NN.
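The intuition behind the tree-model property, namely that a node-classifier with $k$ layers cannot distinguish a graph from a suitable tree of depth $k$ rooted at the designated node, can be checked on a small instance. The sketch below is an illustration under assumptions that the text does not fix: graphs are undirected and given as adjacency lists, labels are scalars, every layer has the toy form `relu(a * x + b * sum_of_neighbours + c)`, and the tree is obtained by unravelling walks that never step straight back to the node they came from. All function names are hypothetical, and the sketch does not implement the verification algorithm $A$ itself.

```python
def relu(x):
    return max(0.0, x)

def run_node_classifier(adj, labels, layers, node):
    # Each layer (a, b, c) maps x_v to relu(a * x_v + b * sum(neighbours) + c).
    x = dict(labels)
    for a, b, c in layers:
        x = {v: relu(a * x[v] + b * sum(x[u] for u in adj[v]) + c) for v in adj}
    return x[node]

def unravel(adj, labels, root, depth):
    # Builds a tree of the given depth whose root (id 0) copies `root`.
    # A tree node copying graph node g gets one child per neighbour of g,
    # except the neighbour already represented by its own parent.
    t_adj, t_lab = {0: []}, {0: labels[root]}
    frontier = [(0, root, None)]  # (tree id, graph node, parent's graph node)
    for _ in range(depth):
        next_frontier = []
        for t, g, parent_g in frontier:
            for u in adj[g]:
                if u == parent_g:
                    continue
                child = len(t_adj)
                t_adj[child] = [t]
                t_adj[t].append(child)
                t_lab[child] = labels[u]
                next_frontier.append((child, u, g))
        frontier = next_frontier
    return t_adj, t_lab

# Triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
labels = {0: 1.0, 1: 2.0, 2: 0.5, 3: 4.0}
layers = [(1.0, 0.5, -1.0), (0.5, 1.0, 0.0)]  # k = 2 layers

on_graph = run_node_classifier(adj, labels, layers, 0)
tree_adj, tree_labels = unravel(adj, labels, 0, len(layers))
on_tree = run_node_classifier(tree_adj, tree_labels, layers, 0)
```

Although the graph contains a cycle and the tree does not, both runs yield the same value at the designated node, and the maximum degree of the tree does not exceed that of the graph.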

6. SUMMARY AND APPLICABILITY OF RESULTS

This work presents two major results: we proved that formal verification of ORP and ARP of graph-classifier MPNN is not possible as soon as we consider parts of the input space containing graphs of unbounded but finite size, non-trivial degree and sufficiently expressive labels. We also showed that formal verification of ORP and ARP of node-classifier MPNN is possible as soon as the degree of the considered input graphs is bounded. These results can serve as a basis for further research on formal verification of GNN, but their extensibility depends on several parameters.

Dependency on the GNN model. We restricted our investigations to GNN from the MPNN model, which is a blueprint for spatial-based GNN (Wu et al. (2021)). However, the MPNN model does not directly specify how the aggregation, combination and readout functions are represented. Motivated by common choices, we restricted our considerations to MPNN where the aggregation functions compute a simple sum of their inputs and the combination and readout functions are represented by NN with ReLU activation only. Theorem 1 and Corollary 1 only extend to GNN models that are at least as expressive as the ones considered here. For some minor changes to our setting, like considering NN with other piecewise-linear activation functions, it is easily argued that both results still hold. However, as soon as we leave the MPNN or spatial-based model, the question of formal verifiability opens anew. Bridging results about the expressiveness of GNN from different models, for example spatial- vs. spectral-based, is ongoing research, as done by Balcilar et al. (2021), and it remains to be seen which future findings on expressiveness can be used to directly transfer the negative results about the impossibility of formal verification obtained here. Analogously, Theorem 2 and Corollary 2 only extend to GNN that are at most as expressive as the ones considered here.
It is not possible, for example, to directly translate these results to models like DropGNN (Papp et al. (2021) ), which are shown to be more expressive than MPNN. Hence, this also remains to be investigated in the future. Dependency on the specifications. Obviously, the results presented here are highly dependent on the choice of input as well as output specifications (see Appendix A for details). We refer to future work for establishing further (im-)possibility results for formal verification of ORP and ARP of GNN, with the ultimate goal of finding tight bounds.

A DETAILS ON IMPORTANT SETS OF SPECIFICATIONS

To prove the results stated in the main part of the paper, we need to work with a formally defined syntax for each kind of specification considered here. However, it should be clear that the results presented in Section 3 do not depend on the exact syntactic form, but on the expressibility of the considered kind of specifications.

Vector specifications $\Psi_{\mathrm{conv}}$. Motivated by common choices in formal verification of classical NN, we often use the following form: a vector specification $\varphi$ for a given set of variables $X$ is defined by the grammar $\varphi ::= \varphi \wedge \varphi \mid t \leq b$, $t ::= c \cdot x \mid t + t$ where $b, c \in \mathbb{Q}$ and $x \in X$ is a variable. A vector specification $\varphi$ with occurring variables $x_0, \dots, x_{n-1}$ is satisfied by $x = (r_0, \dots, r_{n-1}) \in \mathbb{R}^n$ if each inequality in $\varphi$ is satisfied in real arithmetic with each $x_i$ set to $r_i$. We denote the set of all such specifications by $\Psi_{\mathrm{conv}}$. A vector specification that also includes $\vee$ and $<$ operators is called extended. Extended vector specifications are not included in $\Psi_{\mathrm{conv}}$.

Graph specifications from $\Phi_{\mathrm{unb}}$. In the arguments of Section 4 and Appendix B.1 we refer to any set of graph specifications which contains a specification $\varphi$ satisfiable by finite graphs of arbitrary size and degree as well as arbitrary labels as $\Phi_{\mathrm{unb}}$, for instance a specification like $\varphi = \mathit{true}$. However, as indicated in Section 3, this weak form of $\Phi_{\mathrm{unb}}$ is not necessary. The arguments for Theorem 1, given in Section 4 and Appendix B.1, are also valid if $\Phi_{\mathrm{unb}}$ contains at least one specification $\varphi$, satisfiable by graphs of arbitrary size, degree 4 (indicated by Figure 3) and labels expressive enough to represent those used in PCP-structures (Appendix B.1), mainly labels using positive integer values. For example, $\varphi(G) = \deg(G) \leq 4 \wedge \forall v \in V.\, \psi_{\mathrm{PCP}}(v)$, where $\psi_{\mathrm{PCP}}$ is an extended vector specification, checking if the label of a node is valid for a PCP-structure.
However, the exact conditions are highly dependent on the construction used in the reduction from PCP to GLP and can easily be optimised. But the aim of this work is to show that there are fundamental limits in formal verification of ORP (and ARP), and optimising the undecidability results presented in Section 4 in this way would only obscure the understanding of such limits. Thus, we use the weaker $\Phi_{\mathrm{unb}}$ described above in the formal parts, leading to uncluttered arguments and proofs.

Graph-node specifications from $\Phi_{\mathrm{bound}}$. First, we define for each $d, k \in \mathbb{N}$ a set $\Phi^{d,k}_{\mathrm{bound}}$ of graph-node specifications. $\Phi^{d,k}_{\mathrm{bound}}$ is the set of graph-node specifications $\varphi$ bounding the degree of satisfying graphs to $d$ and constraining only nodes in the $k$-neighbourhood of the center node using vector specifications, for instance $\varphi(G, v) = \deg(G) \le 4 \wedge \forall v' \in \mathrm{Neigh}_k(v).\, \psi(v')$, where $\mathrm{Neigh}_k(v)$ includes all nodes within distance $k$ of $v$ and $\psi$ is a vector specification. Then, $\Phi_{\mathrm{bound}} = \bigcup_{d,k \in \mathbb{N}} \Phi^{d,k}_{\mathrm{bound}}$.

Graph or graph-node specifications from $\Phi_{\mathrm{neigh}}$. We consider $\Phi_{\mathrm{neigh}}$ as a set of graph specifications or a set of graph-node specifications. $\Phi_{\mathrm{neigh}}$ consists of graph or graph-node specifications $\varphi$, given by some fully-defined center graph $G$ (or pair $(G, v)$) and a finite modification budget $B$, for instance $\varphi = (G, B)$ or $\varphi = ((G, v), B)$. A finite modification budget specifies a bounded number of structural modifications, namely inserting or deleting nodes and edges, as well as allowed label modifications of nodes in $G$, bounded by vector specifications. Then, a graph or graph-node pair satisfies $\varphi$ if it can be generated from $G$ respectively $(G, v)$ while respecting the budget $B$.

Vector specifications $\Psi_{\mathrm{eq}}$. The set $\Psi_{\mathrm{eq}}$ consists of vector specifications of the form $x_i = b$, that is, vector specifications expressing that a single dimension is equal to some fixed rational value.

Extended vector specifications $\Psi_{\mathrm{leq}}$, $\Psi_{\mathrm{class}}$.
The set $\Psi_{\mathrm{leq}}$ consists of extended vector specifications of the form $\bigwedge_{i \in I} \bigvee_{j \in I \setminus \{i\}} x_i \le x_j$. Analogously, $\Psi_{\mathrm{class}}$ consists of extended vector specifications of the form $\bigvee_{i \in I} \bigwedge_{j \in I \setminus \{i\}} x_i > x_j$. Note that for the argument of Corollary 1 it is sufficient that $(x_1 \le x_2) \wedge (x_2 \le x_1)$ is included in $\Psi_{\mathrm{leq}}$.

B PROOF DETAILS

B.1 PROVING THAT GLP AND DGLP ARE UNDECIDABLE

We use the following abbreviations for GLP. For a set $C$ of colours we define

$$\operatorname{colour}(C) = \bigwedge_{c \in C} (x_c = 0) \vee (x_c = 1)$$

and

$$\operatorname{exactly\_one}(C) = \bigwedge_{c \in C} \Big(c \to \big(\bigwedge_{c' \ne c} \neg c'\big)\Big) \wedge \Big(\neg c \to \bigvee_{c' \ne c} c'\Big)$$

where $c := (x_c = 1)$, $\neg c := (x_c = 0)$, $c \to \varphi := (x_c = 0) \vee \varphi$ and $\neg c \to \varphi := (x_c = 1) \vee \varphi$. The operator $\to$ has weaker precedence than all other GLP operators. To keep the notation clear we denote, if unambiguous, a variable $x_i$ in node and graph conditions by its index $i$.

Let $G = (V, D, L)$ be a graph. For some node set $V' \subseteq V$ and node $v$ we define $\mathrm{Neigh}_v(V') = \mathrm{Neigh}(v) \cap V'$. We call a subset of nodes $V' = \{v_1, \dots, v_k\} \subseteq V$, $k \ge 2$, a chain if $\mathrm{Neigh}_{v_1}(V') = \{v_2\}$, $\mathrm{Neigh}_{v_i}(V') = \{v_{i-1}, v_{i+1}\}$ for $2 \le i \le k-1$ and $\mathrm{Neigh}_{v_k}(V') = \{v_{k-1}\}$. We call $v_1$ the start, $v_i$ a middle node and $v_k$ the end of $V'$, and assume throughout the following arguments that index 1 denotes the start and the maximal index $k$ denotes the end of a chain. Let $V_1 = \{v_1, \dots, v_k\}$ and $V_2 = \{u_1, \dots, u_k\}$ be subsets of $V$ that are both chains. We say that $V_1 \cup V_2$ is a ladder if for all $v_i, u_i$ we have $\mathrm{Neigh}_{v_i}(V_1 \cup V_2) = \mathrm{Neigh}_{v_i}(V_1) \cup \{u_i\}$ and $\mathrm{Neigh}_{u_i}(V_1 \cup V_2) = \mathrm{Neigh}_{u_i}(V_2) \cup \{v_i\}$.

First, we show that DGLP can recognise graphs $G$ that consist of exactly one ladder and one additional chain. If this is the case we call $G$ a chain-ladder. Let $C_3 = \{c_1, c_2, c_3\}$ and $T = \{(c, s), (c, m), (c, e) \mid c \in C_3\}$ be sets of symbols we call colours.
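The chain and ladder definitions above can be checked directly on a concrete graph. The following is a minimal sketch under the assumption that the graph is given as an undirected adjacency dictionary and that the candidate chains are passed as ordered node lists (start first, end last); the function names are hypothetical and not part of the paper.

```python
def build_adj(edges):
    """Build an undirected adjacency dictionary from a list of edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def restricted_neigh(adj, v, subset):
    """Neigh_v(V') = Neigh(v) intersected with V'."""
    return adj[v] & subset

def is_chain(adj, order):
    """order = [v_1, ..., v_k]; check the chain conditions with k >= 2."""
    nodes, k = set(order), len(order)
    if k < 2 or len(nodes) != k:
        return False
    for i, v in enumerate(order):
        expected = set()
        if i > 0:
            expected.add(order[i - 1])
        if i < k - 1:
            expected.add(order[i + 1])
        if restricted_neigh(adj, v, nodes) != expected:
            return False
    return True

def is_ladder(adj, chain1, chain2):
    """Check that two equally long chains are joined rung by rung."""
    if len(chain1) != len(chain2):
        return False
    if not (is_chain(adj, chain1) and is_chain(adj, chain2)):
        return False
    s1, s2 = set(chain1), set(chain2)
    both = s1 | s2
    if len(both) != 2 * len(chain1):
        return False
    for v, u in zip(chain1, chain2):
        if restricted_neigh(adj, v, both) != restricted_neigh(adj, v, s1) | {u}:
            return False
        if restricted_neigh(adj, u, both) != restricted_neigh(adj, u, s2) | {v}:
            return False
    return True

# A ladder with three rungs.
ladder_edges = [("v1", "v2"), ("v2", "v3"), ("u1", "u2"), ("u2", "u3"),
                ("v1", "u1"), ("v2", "u2"), ("v3", "u3")]
ladder_adj = build_adj(ladder_edges)
```

Such a global check inspects the whole node set at once, whereas the DGLP constructed next has to enforce the same structure using only local node conditions and one graph condition.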
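Let $(\varphi_{CL}, \psi_{CL})$ be the following DGLP over variables $\mathrm{Var}_{CL} = \{x_c \mid c \in C_3 \cup T\} \cup \{x_{c,\mathit{id}}, x_{c,e,\mathit{id}} \mid c \in C_3\}$:

$$\varphi_{CL} := \varphi_{cond} \wedge \operatorname{colour}(C_3 \cup T) \wedge \bigwedge_{M \in \{C_3, T\}} \operatorname{exactly\_one}(M)$$

$$\begin{aligned}
\varphi_{cond} := {} & \bigwedge_{c_i \in C_3} \Big[ \big(\neg c_i \to \bigwedge_{(c_i,t) \in T} \neg (c_i, t) \wedge (c_i, \mathit{id}) = 0 \wedge (c_i, e, \mathit{id}) = 0\big) \\
& \quad \wedge \big(c_i \to \odot c_i = 1 \vee \odot c_i = 2\big) \\
& \quad \wedge \big((c_i, s) \to \odot c_i = 1 \wedge (c_i, \mathit{id}) = 1 \wedge \odot (c_i, \mathit{id}) = 2 \wedge (c_i, e, \mathit{id}) = 0\big) \\
& \quad \wedge \big((c_i, m) \to \odot c_i = 2 \wedge 2 (c_i, \mathit{id}) = \odot (c_i, \mathit{id}) \wedge (c_i, e, \mathit{id}) = 0\big) \\
& \quad \wedge \big((c_i, e) \to \odot c_i = 1 \wedge (c_i, \mathit{id}) \le \odot (c_i, \mathit{id}) - 1 \wedge (c_i, e, \mathit{id}) = (c_i, \mathit{id})\big) \Big] \\
& \wedge \big(c_1 \to \odot c_2 = 1 \wedge (c_1, \mathit{id}) = \odot (c_2, \mathit{id})\big) \wedge \big(c_2 \to \odot c_1 = 1 \wedge (c_2, \mathit{id}) = \odot (c_1, \mathit{id})\big)
\end{aligned}$$

$$\psi_{CL} := \bigwedge_{c_i \in C_3} (c_i, s) = 1 \wedge (c_i, e) = 1 \wedge c_i = (c_i, e, \mathit{id})$$

Lemma 3. If $G = (V, D, L)$ satisfies $(\varphi_{CL}, \psi_{CL})$ then $G$ is a chain-ladder, and if $G' = (V', D')$ is an unlabeled chain-ladder then there is $L'$ such that $(V', D', L')$ satisfies $(\varphi_{CL}, \psi_{CL})$.

Proof. Assume that $G$ satisfies $(\varphi_{CL}, \psi_{CL})$. By definition, it follows that all nodes $v \in V$ satisfy $\varphi_{CL}$ and $G$ satisfies $\psi_{CL}$. Let $v \in V$ be a node. Due to $\bigwedge_{M \in \{C_3, T\}} \operatorname{exactly\_one}(M) \wedge \operatorname{colour}(C_3 \cup T)$ we have that $v$ has exactly one colour $c_1$, $c_2$ or $c_3$ and exactly one from $T$. Furthermore, the subformula $\bigwedge_{c_i \in C_3} (\neg c_i \to \bigwedge_{(c_i,t) \in T} \neg (c_i, t) \wedge \cdots)$ implies that there is $i \in \{1, 2, 3\}$ such that $v$ is of colour $c_i$ and $(c_i, t)$ for some $t$. We divide $V$ into three sets $V_1$, $V_2$ and $V_3$ such that $v \in V_i$ if and only if $v$ is of colour $c_i$, and argue that each $V_i$ is a chain. Note that the $V_i$ are pairwise disjoint. Let $v \in V_i$. The subformula $(c_i \to \odot c_i = 1 \vee \odot c_i = 2)$ implies that $v$ has one or two neighbours from $V_i$. From the argument above, we know that $v$ must be of exactly one colour $(c_i, s)$, $(c_i, m)$ or $(c_i, e)$. The $\to$ subformulas in $\varphi_{cond}$ regarding these three colours imply: if $v$ is of colour $(c_i, s)$ or $(c_i, e)$ it must have exactly one neighbour from $V_i$, and if $v$ is of colour $(c_i, m)$ it must have exactly two neighbours from $V_i$.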
The graph condition $\psi_{CL}$ implies that there is exactly one node with colour $(c_i, s)$ and one with colour $(c_i, e)$. In combination, we have that there is a start $v_s$ and an end $v_e$ in $V_i$, both having one neighbour in $V_i$, and all middle nodes $v_m$ having two. Next, consider the $(c_i, \mathit{id})$ and $(c_i, e, \mathit{id})$ label dimensions. We call $(c_i, \mathit{id})$ the id of a node with colour $c_i$. The subformula $(\neg c_i \to \cdots \wedge (c_i, \mathit{id}) = 0 \wedge (c_i, e, \mathit{id}) = 0)$ implies that if a node is not of colour $c_i$ then the corresponding dimensions must be 0, and $((c_i, s) \to \cdots \wedge (c_i, e, \mathit{id}) = 0)$ and $((c_i, m) \to \cdots \wedge (c_i, e, \mathit{id}) = 0)$ imply that if it is not an end node then $(c_i, e, \mathit{id})$ is 0 as well. Next, we see in the subformula $((c_i, s) \to \cdots \wedge (c_i, \mathit{id}) = 1 \wedge \odot (c_i, \mathit{id}) = 2 \wedge \cdots)$ that $v_s$ has id 1 and its neighbour has id 2. This implies that the only neighbour of $v_s$ is not $v_s$ itself. The same holds for $v_e$ due to the subformula $((c_i, e) \to \cdots \wedge (c_i, \mathit{id}) \le \odot (c_i, \mathit{id}) - 1)$. Furthermore, the subformula $((c_i, e) \to \cdots \wedge (c_i, e, \mathit{id}) = (c_i, \mathit{id}))$ implies that the id of $v_e$ is stored in $(c_i, e, \mathit{id})$. This is used in the graph condition subformula $c_i = (c_i, e, \mathit{id})$ to ensure that the number of nodes in $V_i$ is equal to the id of $v_e$. We make a case distinction: if $v_e$ is the neighbour of $v_s$ then the id of $v_e$ is 2 and, thus, $V_i = \{v_s, v_e\}$, which obviously is a chain. If $v_e$ is not the neighbour of $v_s$ then this neighbour must be some middle node $v_m$. The subformula $((c_i, m) \to \cdots \wedge 2 (c_i, \mathit{id}) = \odot (c_i, \mathit{id}) \wedge \cdots)$ implies that $v_m$ is not its own neighbour and that the other neighbour $v'_m$ must have id 3. Now, if $v'_m = v_e$ then we can make the same argument as in the other case. If not, then we get that $v'_m$ must have a neighbour $v''_m \ne v'_m$. The node $v''_m$ must have id 4 and, thus, it did not occur earlier on the chain. As $V_i$ is finite, this sequence must eventually reach $v_e$ and we get that $V_i$ must be a chain.
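So far, we argued that $V = V_1 \cup V_2 \cup V_3$ with the $V_i$ pairwise disjoint and chains. It is left to argue that $V_1 \cup V_2$ forms a ladder. From our previous arguments we know that the nodes of $V_i$ have incrementing ids from $v_s$ to $v_e$ starting with 1. Therefore, the ladder property is ensured by the subformulas $(c_1 \to \odot c_2 = 1 \wedge (c_1, \mathit{id}) = \odot (c_2, \mathit{id}))$ and $(c_2 \to \odot c_1 = 1 \wedge (c_2, \mathit{id}) = \odot (c_1, \mathit{id}))$. The other statement of the lemma, namely that there is a labeling function $L'$ for $G'$ such that $(\varphi_{CL}, \psi_{CL})$ is satisfied, is a straightforward construction of $L'$ following the arguments above.

Let $G$ be a chain-ladder with ladder $V_1 \cup V_2 = \{v_1, \dots, v_k\} \cup \{u_1, \dots, u_k\}$ and chain $V_3 = \{w_1, \dots, w_l\}$. We call $G$ a PCP-structure if for all $w_i$ we have $\mathrm{Neigh}_{w_i}(V_1 \cup V_2 \cup V_3) = \mathrm{Neigh}_{w_i}(V_3) \cup \{v_{h_i}, u_{j_i}\}$ for some $h_i, j_i \in \{1, \dots, k\}$ such that for all $2 \le i \le l - 1$ we have that $h_{i-1} < h_i < h_{i+1}$ and $j_{i-1} < j_i < j_{i+1}$. Intuitively, this property ensures that connections from chain $V_3$ to $V_1$ or $V_2$ do not intersect. We show that there is a DGLP that recognizes PCP-structures. Let $L, M, R$ be colours, called directions. We define $L + 1 := M$, $M + 1 := R$, $R + 1 := L$ and $d - 1$ symmetrically. Let $C_3$ be as above and $F = \{(c, d) \mid c \in C_3, d \in \{L, M, R\}\}$. The DGLP $(\varphi_{PS}, \psi_{PS})$ over the variables $\mathrm{Var}_{PS} = \mathrm{Var}_{CL} \cup \{x_c \mid c \in F\} \cup \{x_{d,c_i,\mathit{id}} \mid d \in \{L, M, R\}, c_i \in \{c_1, c_2\}\}$ is defined as follows:

$$\varphi_{PS} := \varphi_{cond} \wedge \operatorname{colour}(F) \wedge \operatorname{exactly\_one}(F) \wedge \varphi_{CL}$$

$$\begin{aligned}
\varphi_{cond} := {} & \bigwedge_{c_i \in C_3} \big(\neg c_i \to \bigwedge_{(c_i,d) \in F} \neg (c_i, d)\big) \\
& \wedge \bigwedge_{c_i \in C_3} \Big[ \big((c_i, s) \to (c_i, L) \wedge \odot (c_i, M) = 1\big) \wedge \big((c_i, m) \to \bigvee_{(c_i,d) \in F} \big((c_i, d) \wedge \bigwedge_{d' \ne d} \odot (c_i, d') = 1\big)\big) \Big] \\
& \wedge \bigwedge_{c_i \in \{c_1, c_2\}} \big(c_i \to \odot c_3 \le 1\big) \wedge \big(c_3 \to \odot c_1 = 1 \wedge \odot c_2 = 1\big) \\
& \wedge \bigwedge_{(c_3,d) \in F} \Big[ \big(\neg (c_3, d) \to \bigwedge_{c_i \in \{c_1, c_2\}} (d, c_i, \mathit{id}) = 0\big) \wedge \big((c_3, d) \to \bigwedge_{c_i \in \{c_1, c_2\}} \odot (c_i, \mathit{id}) = (d, c_i, \mathit{id})\big) \Big] \\
& \wedge \bigwedge_{(c_3,d) \in F} \big((c_3, d) \to \bigwedge_{c_i \in \{c_1, c_2\}} \odot (d - 1, c_i, \mathit{id}) \le (d, c_i, \mathit{id}) \wedge (d, c_i, \mathit{id}) \le \odot (d + 1, c_i, \mathit{id})\big)
\end{aligned}$$

$$\psi_{PS} := \psi_{CL}$$

Lemma 4.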
If $G = (V, D, L)$ satisfies $(\varphi_{PS}, \psi_{PS})$ then $G$ is a PCP-structure, and if $G' = (V', D')$ is an unlabelled PCP-structure then there is a labelling function $L'$ such that $(V', D', L')$ satisfies $(\varphi_{PS}, \psi_{PS})$.

Proof. Assume that $G$ satisfies $(\varphi_{PS}, \psi_{PS})$. As $\varphi_{CL}$ occurs as a conjunct in $\varphi_{PS}$ and $\psi_{CL}$ in $\psi_{PS}$, Lemma 3 implies that $G$ is a chain-ladder. Let $V_1 \cup V_2$ be the ladder and $V_3$ the chain. The subformulas $\operatorname{exactly\_one}(F)$ and $\operatorname{colour}(F)$ in combination with $\bigwedge_{c_i \in C_3} (\neg c_i \to \bigwedge_{(c_i,d) \in F} \neg (c_i, d))$ imply that a node is of colour $c_i$ if and only if it is of exactly one colour $(c_i, d)$. From the arguments of Lemma 3 we know that each node $v$ has exactly one colour $c_i$ and, thus, $v$ also has a corresponding direction $d \in \{L, M, R\}$. The subformulas $((c_i, s) \to (c_i, L) \wedge \odot (c_i, M) = 1)$ and $((c_i, m) \to \bigvee_{(c_i,d) \in F} ((c_i, d) \wedge \bigwedge_{d' \ne d} \odot (c_i, d') = 1))$ imply that the start node of chain $V_i$ has direction $L$ and its neighbour direction $M$, and that the neighbours of each middle node of direction $d$, characterised by colour $(c_i, m)$, must have directions $d - 1$ and $d + 1$. In combination, this implies that each chain $V_1$, $V_2$ and $V_3$ is coloured from start to end with the pattern $(L, M, R)^*$. The subformulas $\bigwedge_{c_i \in \{c_1, c_2\}} (c_i \to \odot c_3 \le 1)$ and $(c_3 \to \odot c_1 = 1 \wedge \odot c_2 = 1)$ imply that nodes from ladder $V_1 \cup V_2$ have at most one neighbour from $V_3$ and each node from chain $V_3$ has exactly one neighbour from $V_1$ and one from $V_2$. Consider the dimensions $(d, c_i, \mathit{id})$. First, the subformula $(\neg (c_3, d) \to \bigwedge_{c_i \in \{c_1, c_2\}} (d, c_i, \mathit{id}) = 0)$ and the conditions of $\varphi_{CL}$ imply that dimension $(d, c_i, \mathit{id})$ of a node $v$ is nonzero only if $v$ is from $V_3$ and of direction $d$. The subformula $((c_3, d) \to \bigwedge_{c_i \in \{c_1, c_2\}} \odot (c_i, \mathit{id}) = (d, c_i, \mathit{id}))$ leads to the case that each node $v \in V_3$ of direction $d$ has stored the id of its one neighbour from $V_1$ in $(d, c_1, \mathit{id})$ and the id of its one neighbour from $V_2$ in $(d, c_2, \mathit{id})$.
Now, the subformulas $((c_3, d) \to \bigwedge_{c_i \in \{c_1, c_2\}} \odot (d - 1, c_i, \mathit{id}) \le (d, c_i, \mathit{id}))$ and $((d, c_i, \mathit{id}) \le \odot (d + 1, c_i, \mathit{id}))$ imply the main property of a PCP-structure, namely that the connections between $V_3$ and $V_1$ as well as $V_2$ are not intersecting. Note that $\le$ is sufficient as each node from $V_1$ and $V_2$ can have at most one neighbour from $V_3$.

Finally, we are set to prove that DGLP is undecidable. Let $P = \{(\alpha_1, \beta_1), \dots, (\alpha_k, \beta_k)\}$ be a PCP instance over alphabet $\Sigma = \{a, b\}$ and let $m = \max(\bigcup_{i=1}^{k} \{|\alpha_i|, |\beta_i|\})$. Let $B = \{(c_i, d, a, j), (c_i, d, b, j) \mid c_i \in \{c_1, c_2\}, d \in \{L, M, R\}, j \in \{0, \dots, m - 1\}\}$ and $S = \{(c_i, d, p, j) \mid c_i \in \{c_1, c_2\}, d \in \{L, M, R\}, p \in \{1, \dots, k, e, \bot\}, j \in \{0, \dots, m\}\}$ be sets of colours.

Proof of Theorem 3. We prove this via reduction from PCP. Let $P = \{(\alpha_1, \beta_1), \dots, (\alpha_k, \beta_k)\}$ and $(\varphi_P, \psi_P)$ be like above. Assume that $(\varphi_P, \psi_P)$ is satisfied by $G$. From the description above, we can see that $\varphi_{PS}$ and $\psi_{PS}$ are conjunctive subformulas of $\varphi_P$ respectively $\psi_P$. Therefore, Lemma 4 implies that $G$ is a PCP-structure. Let $V_1 \cup V_2$ be the ladder and $V_3$ the additional chain in $G$. In addition to the colours resulting from $\varphi_{PS}$, the subformulas $\operatorname{colour}(B \cup S)$ and $\bigwedge_{c_i \in \{c_1, c_2\}} (c_i \to \bigwedge_{M \in \{B_0, S_0\}} \operatorname{exactly\_one}(M))$ ensure that $B$ and $S$ are colours and that each ladder node has exactly one colour from $B_0 \subset B$ and one from $S_0 \subset S$. The idea of these colours is the following: a colour $(c_i, d, p, j) \in B$ with $p \in \{a, b\}$ and $j \in \{0, \dots, m - 1\}$ represents the symbol ($a$ or $b$) of a node in distance $j$ of a node coloured with $(c_i, d)$. Similarly, a colour $(c_i, d, p, j) \in S$ with $p \in \{1, \dots, k, \bot, e\}$ and $j \in \{0, \dots, m\}$ represents that a node in distance $j$ of a node coloured with $(c_i, d)$ is the start of tilepart $\alpha_p$ if $i = 1$, $p \ne \bot, e$, and of $\beta_p$ if $i = 2$, $p \ne \bot, e$. In case of $p = e$ the node in distance $j$ is the end node of chain $V_i$, and $p = \bot$ is a placeholder for nodes which are neither a start of some tilepart nor the end node.
The case $j = 0$ is interpreted as the node's own symbol or start of a tilepart. We argue how $\varphi_P$ ensures the above mentioned properties of colours $(c_i, d, p, j) \in B \cup S$. The subformula $(\neg (c_i, d) \to \bigwedge_{B \cup S} \neg (c_i, d, p, j))$ ensures that a node of some colour $(c_i, d, p, j)$ must also be of colour $(c_i, d)$. In particular, this implies that nodes from chain $V_3$ do not have any colour $(c_i, d, p, j)$. The subformulas $((c_i, d) \to \bigwedge_{B, j < m - 1} (c_i, d, p, j + 1) = \odot (c_i, d + 1, p, j))$ and $((c_i, d) \to \bigwedge_{S, j < m} (c_i, d, p, j + 1) = \odot (c_i, d + 1, p, j))$ ensure that a node with colour $(c_i, d)$ stores the information $p, j$ of its $(c_i, d + 1)$ neighbour in form of its own colour $(c_i, d, p, j + 1)$. Note that each chain is labeled with $L, M, R, L, \dots$ from start to end and, thus, the $d + 1$ neighbour is the right neighbour in the sense that it is nearer to the end node $v_e$. To understand how this leads to the case that each node on chain $V_i$ stores the information of its $m$ right neighbours, we argue beginning from the end $v_e$ of chain $V_i$. The subformula $((c_i, e) \to \bigwedge_{B \cup S, p \ne e} \neg (c_i, d, p, j) \wedge \bigvee_{F} (c_i, d, e, 0))$ ensures that $v_e$ only has colour $(c_i, d, e, 0)$. That $d$ matches its colour $(c_i, d)$ is ensured by $\neg (c_i, d) \to \cdots$. Therefore, its only and left neighbour $v$ must have colour $(c_i, d - 1, e, 1)$ plus its own additional colours with $j = 0$. Now, the left neighbour $v'$ of $v$ must have colour $(c_i, d - 2, e, 2)$, the colours equivalent to those of $v$ with $j = 1$ and its own colours with $j = 0$, and so on. As the maximum $j$ in case of a colour from $S$ is $m$, tilepart start, end or $\bot$ colours are stored in nodes up to distance $m$ to the left of the original node. The same holds for colours from $B$ with distance $m - 1$.
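The propagation rule described above, where the entry for distance $j + 1$ at a node equals the entry for distance $j$ at its right neighbour, can be illustrated on a plain word. The sketch below is a simplification made for this example: it uses string positions instead of graph nodes, ignores the directions $L, M, R$ and the neighbour sum $\odot$, and all function names are hypothetical.

```python
def right_windows(symbols, m):
    """For each position i, map each distance j (0 <= j < m) to the symbol
    found j steps to the right of i, as far as the word reaches."""
    n = len(symbols)
    return [
        {j: symbols[i + j] for j in range(m) if i + j < n}
        for i in range(n)
    ]

def propagation_consistent(windows):
    """Check the local rule: the entry for distance j + 1 at position i
    equals the entry for distance j at the right neighbour i + 1."""
    for i in range(len(windows) - 1):
        for j, symbol in windows[i + 1].items():
            if j + 1 in windows[i] and windows[i][j + 1] != symbol:
                return False
    return True

# Hypothetical word over {a, b} with window size m = 3.
windows = right_windows("abba", 3)
consistent = propagation_consistent(windows)
```

The point of the local rule is that it only compares a node with its direct right neighbour, which matches what a node condition of a DGLP can express.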



In other words, the problem of determining, given an MPNN N and descriptions of valid inputs and outputs, whether the corresponding property holds, is decidable. We assume that model checking of specifications from Φ bound is decidable. Otherwise, verification of corresponding ORP becomes undecidable due to trivial reasons.



Figure 2: A solution for the PCP instance P 0 .

Figure 3: Encoded solution I of PCP instance P 0 .

Figure 4: Tree-model property of a two-layered MPNN.

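The set $S$ thus consists of the colours $(c_i, d, p, j)$ with $c_i \in \{c_1, c_2\}$, $d \in \{L, M, R\}$, $p \in \{1, \dots, k, e, \bot\}$ and $j \in \{0, \dots, m\}$. Additionally, we define $S^\top = S \setminus \{(c_i, d, \bot, j) \mid (c_i, d, \bot, j) \in S\}$, $S_0 = \{(c_i, d, p, 0) \mid (c_i, d, p, 0) \in S\}$ and $B_0 = \{(c_i, d, p, 0) \mid (c_i, d, p, 0) \in B\}$. We define the following DGLP $(\varphi_P, \psi_P)$ over the variables $\mathrm{Var}_{PS} \cup \{x_c \mid c \in B \cup S\}$:

$$\varphi_P := \varphi_{cond} \wedge \bigwedge_{c_i \in \{c_1, c_2\}} \big(c_i \to \bigwedge_{M \in \{B_0, S_0\}} \operatorname{exactly\_one}(M)\big) \wedge \varphi_{PS} \wedge \operatorname{colour}(B \cup S)$$

$$\begin{aligned}
\varphi_{cond} := {} & \bigwedge_{F} \big(\neg (c_i, d) \to \bigwedge_{B \cup S} \neg (c_i, d, p, j)\big) \\
& \wedge \bigwedge_{F} \big((c_i, d) \to \bigwedge_{B, j < m - 1} (c_i, d, p, j + 1) = \odot (c_i, d + 1, p, j)\big) \\
& \wedge \bigwedge_{F} \big((c_i, d) \to \bigwedge_{S, j < m} (c_i, d, p, j + 1) = \odot (c_i, d + 1, p, j)\big) \\
& \wedge \bigwedge_{\{c_1, c_2\}} \big((c_i, e) \to \bigwedge_{B \cup S, p \ne e} \neg (c_i, d, p, j) \wedge \bigvee_{F} (c_i, d, e, 0)\big) \\
& \wedge \bigwedge_{\{c_1, c_2\}} \big(\neg (c_i, e) \to \bigwedge_{F} \neg (c_i, d, e, 0)\big) \\
& \wedge \bigwedge_{S^\top, p \ne e} \Big((c_1, d, p, 0) \to \odot c_3 = 1 \wedge \big(\bigwedge_{j=0}^{|\alpha_p| - 1} (c_1, d, \alpha_p[j], j) \wedge (c_1, d, \bot, j)\big) \wedge \big(\bigvee_{p'=1}^{k} (c_1, d, p', |\alpha_p|) \vee (c_1, d, e, |\alpha_p|)\big)\Big) \\
& \wedge \bigwedge_{S^\top, p \ne e} \Big((c_2, d, p, 0) \to \odot c_3 = 1 \wedge \big(\bigwedge_{j=0}^{|\beta_p| - 1} (c_2, d, \beta_p[j], j) \wedge (c_2, d, \bot, j)\big) \wedge \big(\bigvee_{p'=1}^{k} (c_2, d, p', |\beta_p|) \vee (c_2, d, e, |\beta_p|)\big)\Big) \\
& \wedge \bigwedge_{\{c_1, c_2\}} \big((c_i, s) \to \bigvee_{S^\top} (c_i, d, p, 0)\big) \\
& \wedge \bigwedge_{\{L, M, R\}} \big((c_1, d) \to (c_1, d, a, 0) = \odot (c_2, d, a, 0) \wedge (c_1, d, b, 0) = \odot (c_2, d, b, 0)\big) \\
& \wedge \Big(c_3 \to \big(\bigwedge_{S_0} \bigvee_{d' \in \{L, M, R\}} \odot (c_1, d, p, 0) = \odot (c_2, d', p, 0)\big) \wedge \bigvee_{S^\top} \odot (c_1, d, p, 0) = 1\Big)
\end{aligned}$$

$$\psi_P := \psi_{PS}$$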


We are set to argue that $G$ encodes a solution $I$ of $P$. The subformula $(\bigwedge_{S^\top, p \ne e} (c_1, d, p, 0) \to \cdots)$ ensures that for each node from $V_1$ that is a tilepart start for some $\alpha_p$, the word $\alpha_p$ is written to the right without a next tilepart starting ($\bigwedge_{j=0}^{|\alpha_p| - 1} (c_1, d, \alpha_p[j], j) \wedge (c_1, d, \bot, j)$) and that, after $\alpha_p$ is finished, either the next tilepart starts or the chain ends ($\bigvee_{p'=1}^{k} (c_1, d, p', |\alpha_p|) \vee (c_1, d, e, |\alpha_p|)$). Analogous conditions are ensured for nodes from $V_2$ by the subformula $(\bigwedge_{S^\top, p \ne e} (c_2, d, p, 0) \to \cdots)$. Now, the subformula $\bigwedge_{\{c_1, c_2\}} ((c_i, s) \to \bigvee_{S^\top} (c_i, d, p, 0))$ ensures that the start nodes of $V_1$ and $V_2$ correspond to a tilepart start. In combination with the previous conditions, this ensures that chains $V_1$ and $V_2$ are coloured with words $w_\alpha$ and $w_\beta$ corresponding to sequences $\alpha_{i_1} \cdots \alpha_{i_h}$ and $\beta_{j_1} \cdots \beta_{j_l}$. The subformula $\bigwedge_{\{L, M, R\}} ((c_1, d) \to (c_1, d, a, 0) = \odot (c_2, d, a, 0) \wedge (c_1, d, b, 0) = \odot (c_2, d, b, 0))$ ensures that the two nodes joined by a rung of the ladder carry the same symbol and, thus, that $w_\alpha = w_\beta$. That both index sequences coincide follows from the subformula $(c_3 \to \cdots)$ and the fact that $G$ is a PCP-structure, which means that connections between $V_3$ and $V_1$ respectively $V_2$ are not intersecting. Therefore, the sequence $i_1 \cdots i_h = j_1 \cdots j_l$ is a solution for $P$, which implies that $P$ is solvable.

The vice-versa direction, namely that if $P$ is solvable then $(\varphi_P, \psi_P)$ is satisfiable, is argued easily: if $P$ is solvable then there is a solution $I$. Figure 3 indicates how to encode $I$ as a PCP-structure $G$. Note that in contrast to the visualisation, the encoding characterized by $\varphi_{PS}$ demands that the end nodes of chains $V_1$ and $V_2$ are not part of solution $I$. Lemma 4 states that for each unlabeled PCP-structure there is a labeling function $L'$ such that $G$ satisfies $(\varphi_{PS}, \psi_{PS})$. Therefore, if we take a matching PCP-structure $G$ without labels, label it with $L'$ and then extend $L'$ with the colours $(c_i, d, p, j)$ according to $I$ and the arguments above, we get that $G$ satisfies $(\varphi_P, \psi_P)$.

We can see from the definitions of $\varphi_{CL}$, $\varphi_{PS}$ and $\varphi_P$ and the corresponding graph conditions that they belong to the DGLP fragment of GLP. This proves the statement of Corollary 3.

Consider the $\langle x \in M \rangle$ gadget and assume that $r \in M$.
It clearly holds that r ∈ [i 1 ; i k ] and therefore that re(⟨x-r) for j ̸ = l and, thus, the inner sum is equal to 0 as well. Now, assume that r ̸ ∈ M. If r < i 1 or r > i k it follows that re(⟨xmust be the case that r ∈ (i j ; i j+1 ) for some i ≤ j < k and therefore that re(That the gadget ⟨x ∈ M⟩ is positive is obvious as the outermost function is ReLU.

B.3 PROVING THE TREE-MODEL PROPERTY OF NODE-CLASSIFIER MPNN OVER BOUNDED GRAPHS

In the proof of Theorem 2 we claim that node-classifier MPNN have the tree-model property. We formally prove this statement in the following.

Published as a conference paper at ICLR 2023

Let $G = (V, D, L)$ be a graph and $v \in V$. The set of straight $i$-paths $P^i_v$ of $v$ is defined as

Let $N$ be a node-classifier MPNN. We denote the value of $v$ after the application of layer

Proof. Assume that $(N, \varphi, \psi)$ is as stated above. The direction from right to left is straightforward. Therefore, assume that $(N, \varphi, \psi)$ holds. By definition, there exists a $d$-graph

$P^k_w(i)$. The set of edges $D_B$ is given in the obvious way by $\{((v, p), (v', pv')) \mid (v, p), (v', pv') \in V_B\}$ and closed under symmetry. Note that from the definition of $P^k_w$ it follows that $B$ is a well-defined $d$-tree of depth $k$. The labeling function $L_B$ is defined such that $L_B((v, p)) = L(v)$ for all $(v, p) \in V_B$. From its construction it follows that $(B, (w, w)) \models \varphi$.

We show that $N^{k'}(B, (w, w)) = N^{k'}(G, w)$, which directly implies that $N(B, (w, w)) = N(G, w)$. We do this by showing the following stronger statement for all $j = 0, \dots, k'$ via induction: for all $(v, p) \in P^{k'}_w(k' - i)$ with $k' \ge i \ge j$ it holds that $N^j(B, (v, p)) = N^j(G, v)$. The case $j = 0$ is obvious as $L_B$ is defined in accordance with $L$. Therefore assume that the statement holds for $j$)) and, thus, $(v', v) \in D$ or $(v, v') \in D$, which in both cases means $v' \in \mathrm{Neigh}(v)$. The induction hypothesis implies that $N^j(B, (v', p')) = N^j(G, v')$. As these arguments hold for all $(v', p') \in \mathrm{Neigh}((v, p))$ we get that $\sum_{\mathrm{Neigh}((v, p))} N^j(B, (v', p')) \le \sum_{\mathrm{Neigh}(v)} N^j(G, v')$. The vice-versa direction is argued analogously, which then implies that $\sum_{\mathrm{Neigh}((v, p))} N^j(B, (v', p')) = \sum_{\mathrm{Neigh}(v)} N^j(G, v')$. From the induction hypothesis we get that $N^j(B, (v, p)) = N^j(G, v)$. Then, the definition of an MPNN layer implies that $N^{j+1}(B, (v, p)) = N^{j+1}(G, v)$.
Therefore, the overall statement holds for all j ≤ k ′ and by taking j = i = k ′ we get the desired result.
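The tree-model property can be observed on a small example. The following minimal sketch makes several assumptions that are not taken from the text above: straight paths are read as non-backtracking paths, the graph is simple and undirected, labels are single integers, and each layer is a toy instance of an MPNN layer of the form $\mathrm{ReLU}(a \cdot x_v + b \cdot \sum_{u \in \mathrm{Neigh}(v)} x_u + c)$ with hypothetical integer weights. All function names are made up for this example.

```python
from collections import defaultdict

def unravel(adj, labels, root, depth):
    """Unravel the graph from root into a tree of the given depth.
    Tree nodes are non-backtracking paths starting at root (assumption:
    this is how straight paths are read here)."""
    tree_adj = defaultdict(set)
    tree_labels = {(root,): labels[root]}
    frontier = [(root,)]
    for _ in range(depth):
        next_frontier = []
        for path in frontier:
            parent = path[-2] if len(path) > 1 else None
            for v in adj[path[-1]]:
                if v == parent:
                    continue  # do not step straight back to the parent
                child = path + (v,)
                tree_adj[path].add(child)
                tree_adj[child].add(path)
                tree_labels[child] = labels[v]
                next_frontier.append(child)
        frontier = next_frontier
    return tree_adj, tree_labels

def run_mpnn(adj, labels, layers):
    """Apply toy layers (a, b, c): x_v <- max(0, a*x_v + b*sum(neighbours) + c)."""
    h = dict(labels)
    for a, b, c in layers:
        h = {v: max(0, a * h[v] + b * sum(h[u] for u in adj[v]) + c) for v in h}
    return h

# Hypothetical graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
graph_adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
graph_labels = {0: 1, 1: 2, 2: 3, 3: 5}
layers = [(1, 1, 0), (2, 1, -3)]  # two layers, so trees of depth 2 suffice

graph_values = run_mpnn(graph_adj, graph_labels, layers)
tree_values = {}
for w in graph_adj:
    t_adj, t_labels = unravel(graph_adj, graph_labels, w, len(layers))
    tree_values[w] = run_mpnn(t_adj, t_labels, layers)[(w,)]
```

For every root $w$, the value computed at the root of the unravelled tree agrees with the value computed at $w$ in the original graph, because every tree node at depth less than the number of layers sees the same multiset of neighbour labels as its original node (its parent plus its children).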

