A 2-PARAMETER PERSISTENCE LAYER FOR LEARNING

Abstract

1-parameter persistent homology, a cornerstone of Topological Data Analysis (TDA), studies the evolution of topological features such as connected components and cycles hidden in data. It has been applied to strengthen the representation power of deep learning models such as Graph Neural Networks (GNNs). To enrich the representations of topological features, we propose to study 2-parameter persistence modules induced by bi-filtration functions. To incorporate these representations into machine learning models, we introduce a novel vectorization of 2-parameter persistence modules called Generalized Rank Invariant Landscape (GRIL). We show that this vector representation is 1-Lipschitz (stable) and differentiable with respect to the underlying filtration functions, and that it can be easily integrated into machine learning models to augment the encoding of topological features. We present an algorithm to compute the vectorization and its gradients. We also test our methods on synthetic graph datasets and benchmark graph datasets, and compare the results with previous vector representations of 1-parameter and 2-parameter persistence modules.

1. INTRODUCTION

Machine learning models such as Graph Neural Networks (GNNs) (Gori et al., 2005; Scarselli et al., 2009; Kipf & Welling, 2017; Xu et al., 2019) are well-known, successful tools from the geometric deep learning community. The representation power of such models can be augmented by infusing topological information in the form of a vector representation of the persistent homology of the underlying space hidden in data. Many recent works have successfully integrated topological information with machine learning models (Carrière et al., 2020; Kim et al., 2020; Gabrielsson et al., 2020; Hofer et al., 2020; Horn et al., 2022; Swenson et al., 2020; Bouritsas et al., 2022; Corbet et al., 2019; Carrière & Blumberg, 2020; Vipond, 2020). Most of these works use 1-parameter persistent homology as the topological information. In contrast, (Corbet et al., 2019; Vipond, 2020; Carrière & Blumberg, 2020) use vector representations of 2-parameter persistence modules. In (Carrière & Blumberg, 2020) and (Corbet et al., 2019), these representations are based on slices of 2-parameter persistence modules along lines, which were first studied and computed by (Lesnick & Wright, 2015). In (Vipond, 2020), the author generalizes the notion of 1-parameter persistence landscapes (Bubenik, 2015). In this paper we propose a novel vector representation, Generalized Rank Invariant Landscape (GRIL), for 2-parameter persistence modules, which encodes richer information than fibered barcodes alone. Its building blocks are based on the idea of the generalized rank invariant (Kim & Mémoli, 2021; Dey et al., 2022). The construction of GRIL is a generalization of the persistence landscape (Bubenik, 2015; Vipond, 2020). We will show that the vector representation GRIL is 1-Lipschitz and differentiable with respect to the filtration function f, which allows us to build a differentiable topological layer, PERSGRIL, in a machine learning pipeline.
We demonstrate its use on synthetic datasets and standard graph datasets (see footnote). From the perspective of direct use of 2-parameter persistence modules in machine learning models, to the best of our knowledge, this is the first work of its kind.
Persistent homology is a useful tool for characterizing the shape of data. Rooted in the theory of algebraic topology and algorithms, it has spawned the flourishing area of Topological Data Analysis (TDA). Classical persistent homology, also known as the 1-parameter persistence module, has attracted plenty of attention from both theory (Edelsbrunner & Harer, 2010; Oudot, 2015; Carlsson & Vejdemo-Johansson, 2021; Dey & Wang, 2022; Hofer et al., 2017; Li et al., 2022; Mémoli et al., 2022) and applications (Yan et al., 2021; Zhao et al., 2020; Yang et al., 2021b; a; Banerjee et al., 2020; Wu et al., 2020; Wang et al., 2020; Chen et al., 2021; Hu et al., 2021; Yan et al., 2022). The standard pipeline of the 1-parameter persistence module is as follows. Given a domain of interest $X$ (e.g. a topological space, point cloud data, a graph, or a simplicial complex) with a scalar function $f : X \to \mathbb{R}$, one filters the domain $X$ by the sublevel sets $X_\alpha \triangleq \{x \in X \mid f(x) \le \alpha\}$ with a continuously increasing threshold $\alpha \in \mathbb{R}$. The collection $\{X_\alpha\}$, which is called a filtration, forms an increasing sequence of subspaces $\emptyset = X_{-\infty} \subseteq X_{\alpha_1} \subseteq \cdots \subseteq X_{+\infty} = X$. Along the filtration, topological features appear, persist, and disappear over some intervals. We consider the $p$-th homology groups $H_p(-)$ (over a field, see (Hatcher, 2000)) of the subspaces in this filtration, which results in a sequence of vector spaces. These vector spaces are connected by inclusion-induced linear maps, forming an algebraic structure $0 = H_p(X_{-\infty}) \to H_p(X_{\alpha_1}) \to \cdots \to H_p(X_{+\infty})$.
This algebraic structure, known as the 1-parameter persistence module induced by $f$ and denoted as $M^f$, can be uniquely decomposed into a collection of atomic modules called interval modules, which completely characterizes the topological features with regard to three behaviors (appearance, persistence, and disappearance) of all $p$-dimensional cycles. This unique decomposition of a 1-parameter persistence module is commonly summarized as a persistence diagram (Edelsbrunner et al., 2002) or barcode (Zomorodian & Carlsson, 2005). Figure 1 (left) shows a filtration of a simplicial complex which induces a 1-parameter persistence module and its decomposition into bars.
Some problems in practice may demand tracking the topological information in a filtration that is not necessarily linearly ordered. For example, in (Adcock et al., 2014), the 2-parameter persistence module is shown to be better for classifying hepatic lesions than 1-parameter persistence. In (Keller et al., 2018), a virtual screening system based on 2-parameter persistence modules is shown to be effective in searching for new candidate drugs. In such applications, instead of studying a sequential filtration filtered by a scalar function, one may study a grid-filtration induced by an $\mathbb{R}^2$-valued bifiltration function $f : X \to \mathbb{R}^2$, with $\mathbb{R}^2$ equipped with the partial order $u \le v \iff u_1 \le v_1,\ u_2 \le v_2$; see Figure 1 (right) for an example of a 2-parameter filtration. Following a similar pipeline as for the 1-parameter persistence module, one gets a collection of vector spaces $\{M^f_u\}_{u \in \mathbb{R}^2}$ indexed by vectors $u = (u_1, u_2) \in \mathbb{R}^2$ and linear maps $\{M^f(u \le v) : M^f_u \to M^f_v \mid u \le v \in \mathbb{R}^2\}$ for all comparable $u \le v$. The entire structure $M^f$, in analogy to the 1-parameter case, is called a 2-parameter persistence module induced from $f$.
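As a minimal illustration of the two pipelines described above, the following Python sketch restricts attention to $H_0$ (connected components) of a small graph. It is not the GRIL algorithm of this paper. It assumes a filtration function given on vertices, with an edge entering the filtration once both of its endpoints are present; the helper names `make_find`, `h0_barcode`, `leq`, and `h0_rank` are hypothetical and chosen for this example only. The first function computes the $H_0$ barcode of a sublevel-set filtration using a union-find structure and the elder rule. The second computes the rank of the map $M^f(u \le v)$ in degree 0, which equals the number of connected components of $X_v$ that contain a vertex of $X_u$.

```python
import math


def make_find(parent):
    """Return a find() with path halving over the given parent map."""
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    return find


def h0_barcode(f, edges):
    """H_0 barcode of the sublevel-set filtration of a graph.

    f maps each vertex to a real value; an edge (a, b) is assumed to
    appear at max(f[a], f[b]). Zero-length bars are dropped.
    """
    parent = {x: x for x in f}
    find = make_find(parent)
    birth = dict(f)  # birth value of the component rooted at each vertex
    bars = []
    for a, b in sorted(edges, key=lambda e: max(f[e[0]], f[e[1]])):
        t = max(f[a], f[b])
        ra, rb = find(a), find(b)
        if ra == rb:
            continue  # the edge closes a cycle, so H_0 is unchanged
        if birth[ra] > birth[rb]:
            ra, rb = rb, ra  # make ra the older component
        if birth[rb] < t:
            bars.append((birth[rb], t))  # elder rule: the younger one dies
        parent[rb] = ra
    bars += [(birth[r], math.inf) for r in {find(x) for x in f}]
    return sorted(bars)


def leq(u, v):
    """Product partial order on R^2: u <= v iff u1 <= v1 and u2 <= v2."""
    return u[0] <= v[0] and u[1] <= v[1]


def h0_rank(f2, edges, u, v):
    """Rank of H_0(X_u) -> H_0(X_v) for a vertex bifiltration f2 and u <= v."""
    if not leq(u, v):
        raise ValueError("u and v must be comparable with u <= v")
    x_v = {x for x in f2 if leq(f2[x], v)}
    parent = {x: x for x in x_v}
    find = make_find(parent)
    for a, b in edges:
        if a in x_v and b in x_v:
            parent[find(a)] = find(b)
    # Count the components of X_v that are hit by some vertex of X_u.
    return len({find(x) for x in x_v if leq(f2[x], u)})


edges = [("a", "b"), ("b", "c")]

# 1-parameter example on the path a - b - c.
f = {"a": 0.0, "b": 2.0, "c": 1.0}
print(h0_barcode(f, edges))

# 2-parameter example on the same path.
f2 = {"a": (0, 1), "b": (2, 2), "c": (1, 0)}
for u in [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)]:
    print(u, h0_rank(f2, edges, u, u))  # dimension of M_u in degree 0
print(h0_rank(f2, edges, (1, 1), (2, 2)))
```

Calling `h0_rank` with `u == v` gives the dimension of $M^f_u$. Higher homology degrees and general simplicial complexes require boundary-matrix computations, which this sketch does not cover.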
Unlike the 1-parameter case, the algebraic structure of 2-parameter persistence modules is much more complicated. There is no complete discrete invariant like persistence diagrams or barcodes for 2-parameter persistence modules (Carlsson & Zomorodian, 2009). A good non-complete invariant for 2-parameter persistence modules should characterize as many non-isomorphic topological features as possible. At the same time, it should be stable with respect to small perturbations of filtration functions, which guarantees the properties of continuity and differentiability that are important for machine learning models. Therefore, how to build a good summary for 2-parameter persistence modules which is also applicable to machine learning models is an important problem.

2. 2-PARAMETER PERSISTENCE LANDSCAPE

From the perspective of representation learning, a persistence module can be viewed as a special representation of a discrete topological space, like point cloud data or a graph embedding, which captures geometric and topological information. A 1-parameter persistence module captures information about topological features that persist across different scales. Here, we consider a bi-filtration which leads to a 2-parameter persistence module. To better utilize the richer information captured by 2-



Footnote: The code for the full implementation will be available after the review process is completed.



Figure 1: (left) 1-parameter filtration and bars; (right) a 2-parameter filtration inducing a 2-parameter persistence module whose decomposition is not shown.

