HARNESSING SPECTRAL REPRESENTATIONS FOR SUBGRAPH ALIGNMENT

Abstract

With the rise of graph learning techniques, graph data has become ubiquitous. However, while considerable effort is being devoted to the design of new convolutional architectures, pooling or positional encoding schemes, less effort is being spent on problems involving maps between (possibly very large) graphs, such as signal transfer, graph isomorphism and subgraph correspondence. In this paper, we anticipate the need for a convenient framework to deal with such problems, and focus in particular on the challenging subgraph alignment scenario. We claim that, first and foremost, the representation of a map plays a central role in how these problems should be modeled. Taking a hint from recent work in geometry processing, we propose the adoption of a spectral representation for maps that is compact, easy to compute, robust to topological changes, easy to plug into existing pipelines, and especially effective for subgraph alignment problems. We report for the first time a surprising phenomenon in which the partiality associated with the subgraph manifests as a special structure of the map coefficients, even in the absence of an exact subgraph isomorphism; this structure is consistently observed across different families of graphs with up to several thousand nodes.

1. INTRODUCTION

The ability to align data is at the heart of many successful techniques in machine learning and related areas. In its most abstract form, the problem has a straightforward formulation: given two generic domains D_1 and D_2, find a transformation T such that T(D_1) ≈ D_2 according to some approximation metric that depends on the task. Examples of such problems are found in numerous applications, including molecular docking (Gainza et al., 2020), image-based rendering (Fachada et al., 2021), 3D reconstruction (Zhao et al., 2022), generative models (Dai & Hang, 2021) and style transfer (Zhang et al., 2022), in addition to countless others. Recent remarkable examples include CLIP (Meila & Zhang, 2021), where images are associated with their corresponding captions by aligning their learned embeddings, and MaSIF (Gainza et al., 2020), where the interaction site between protein structures (i.e., the surface patches where the proteins geometrically align) is predicted by a geometric deep learning pipeline.

Perhaps the most challenging setting for alignment problems arises whenever the two domains correspond only partially, for example due to missing observations or noise in the data. In this case, one is interested not only in aligning the two domains, but also in discovering which portions of the domains actually align. The problem is particularly hard if an exact alignment does not even exist, which requires additional robustness to local perturbations in the data.

In this paper, we focus on the general problem of subgraph alignment, as it is representative of a broad spectrum of applications including those mentioned above. We assume that we are given two graphs G_1 and G_2, where G_2 appears within G_1, possibly up to topological changes. A special case arises when G_2 is isomorphic to a subgraph of G_1, which is referred to as subgraph isomorphism (see (ii) in Figure 1). This case is included in our treatment, but we also consider noisier settings where a subgraph isomorphism does not exist (see (iii) in Figure 1), yet a semantic correspondence can still be defined.
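To make the two settings concrete, the following is a minimal sketch and not the method of this paper. It assumes undirected graphs stored as dense adjacency matrices and the induced-subgraph convention for subgraph isomorphism; the helper names (induced_subgraph, selection_matrix, is_subgraph_isomorphism) are hypothetical and introduced only for illustration. A correspondence from G_2 to G_1 is written as a binary matrix P with one nonzero entry per row, and it is an exact (induced) subgraph isomorphism when P A_1 P^T = A_2.

```python
import numpy as np

def induced_subgraph(adj, nodes):
    """Adjacency matrix of the subgraph of `adj` induced by `nodes`, in that order."""
    idx = np.asarray(nodes)
    return adj[np.ix_(idx, idx)]

def selection_matrix(nodes, n1):
    """Binary (n2 x n1) matrix P with P[i, nodes[i]] = 1.

    Row i says: node i of G_2 corresponds to node nodes[i] of G_1.
    """
    P = np.zeros((len(nodes), n1), dtype=int)
    P[np.arange(len(nodes)), nodes] = 1
    return P

def is_subgraph_isomorphism(adj1, adj2, P):
    """True if P embeds G_2 into G_1 as an induced subgraph, i.e. P A_1 P^T == A_2."""
    return bool(np.array_equal(P @ adj1 @ P.T, adj2))

# G_1: a cycle on 6 nodes (0-1-2-3-4-5-0).
n1 = 6
A1 = np.zeros((n1, n1), dtype=int)
for i in range(n1):
    j = (i + 1) % n1
    A1[i, j] = A1[j, i] = 1

# G_2: the path induced by nodes 4, 2, 3 of G_1, with its own node ordering.
nodes = [4, 2, 3]
A2 = induced_subgraph(A1, nodes)
P = selection_matrix(nodes, n1)

# Exact setting: the correspondence P is a subgraph isomorphism.
exact = is_subgraph_isomorphism(A1, A2, P)

# Noisy setting: add one edge to G_2, turning it into a triangle.
# A 6-cycle contains no triangle, so no exact embedding exists,
# yet P is still the intended correspondence between the nodes.
A2_noisy = A2.copy()
A2_noisy[0, 1] = A2_noisy[1, 0] = 1
noisy = is_subgraph_isomorphism(A1, A2_noisy, P)

print(exact, noisy)
```

In the perturbed case the test on P fails even though P remains the meaningful correspondence, which is the situation the noisier setting above is meant to cover.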

Contribution.

In this paper, we focus in particular on the choice of a representation for the correspondence. That is, instead of introducing a new matching pipeline to solve subgraph alignment, we

