TOPOLOGICALLY FAITHFUL IMAGE SEGMENTATION VIA INDUCED MATCHING OF PERSISTENCE BARCODES

Abstract

Image segmentation is a widely researched field in which neural networks find vast applications across many facets of technology. Some of the most popular approaches to training segmentation networks employ loss functions that optimize pixel-overlap, an objective that is insufficient for many segmentation tasks. In recent years, these limitations have fueled a growing interest in topology-aware methods, which aim to recover the correct topology of the segmented structures. However, so far, none of the existing approaches achieve a spatially correct matching between the topological features of ground truth and prediction. In this work, we propose the first topologically and feature-wise accurate metric and loss function for supervised image segmentation, which we term TopoMatch. We show how induced matchings guarantee a spatially correct matching between barcodes in a segmentation setting. Furthermore, we propose an efficient algorithm to compute TopoMatch for images. We show that TopoMatch is an interpretable metric for evaluating the topological correctness of segmentations, and that it is more sensitive than the well-established Betti number error. Moreover, TopoMatch is differentiable, which enables its use as a loss function. It improves the topological performance of segmentation networks across six diverse datasets while preserving volumetric performance.

1. INTRODUCTION

Topology studies properties of shapes that are related to their connectivity and that remain unchanged under continuous deformations such as stretching, translation, and twisting. Some topological concepts, such as cubical complexes, homology, and Betti numbers, form interpretable descriptions of shapes in space that can be computed efficiently. Naturally, the topology of physical structures is highly relevant in machine learning tasks where the preservation of connectivity is crucial, a prominent example being image segmentation. Recently, a number of methods have been proposed to improve topology preservation in image segmentation for a wide range of applications. However, none of the existing concepts take the spatial location of the topological features (e.g., connected components or cycles) within their respective image into account. Evidently, spatial correspondence of these features is a critical property of segmentations, see Fig. 1.

Figure 1: Comparison of our matching with the Wasserstein matching (Hu et al. (2019)). We match cycles between label and prediction for a CREMI image and denote matched pairs in the same color. For presentation clarity, we visualize only six matched pairs (randomly selected out of the total 23 matches for both methods). Note that TopoMatch always matches spatially correctly, while the Wasserstein matching gets most matches wrong.

Our contribution In this work, we introduce a rigorous framework for faithfully quantifying the preservation of topological properties in the context of image segmentation, see Fig. 2. Our method builds on the concept of induced matchings between persistence barcodes from algebraic topology, introduced by Bauer & Lesnick (2015). Introducing these matchings to a machine learning setting allows us to precisely formulate spatial correspondences between topological features of two grayscale images. We achieve this by embedding both images into a common comparison image.
Put simply, our central contribution is an efficient, differentiable solution for localized topological error finding, which serves as:
• a topological loss to train segmentation networks, which is guaranteed to emphasize and penalize topological structures in a spatially correct manner during training (see Sec. 3.2);
• an interpretable topological quality metric for image segmentation, which is sensitive not only to the number of topological features but also to their location within the respective images (see Sec. 3.3).
Experimentally, our TopoMatch construction proves to be an effective loss function, leading to vastly improved topology across six diverse datasets.

1.1. RELATED WORK

Algebraic stability of persistence Several proofs for the stability of persistence have been proposed in the literature. In 2005, Cohen-Steiner et al. (2005) established a first stability result for the persistent homology of real-valued functions. The result states that the map sending a function to the barcode of its sublevel sets is 1-Lipschitz with respect to suitable metrics. In 2008, this result was generalized by Chazal et al. (2009b) and formulated in purely algebraic terms, in what is now known as the algebraic stability theorem. It states that the existence of a δ-interleaving (a sort of approximate isomorphism) between two pointwise finite-dimensional persistence modules implies the existence of a δ-matching between their respective barcodes. This theorem provides the justification for the use of persistent homology to study noisy data. In Bauer & Lesnick (2015), the authors present a constructive proof of this theorem, which associates to a given δ-interleaving between persistence modules a specific δ-matching between their barcodes. For this purpose, they introduce the notion of induced matchings, which form the foundation of our proposed TopoMatch framework.

Topology-aware segmentation Multiple works have highlighted the importance of topologically correct segmentations in various computer vision applications. Persistent homology is a popular tool from algebraic topology to address this issue. A key publication by Hu et al. (2019) introduced the Wasserstein loss, a variation of the Wasserstein distance, to improve image segmentation. They match points of dimension 1 in the persistence diagrams (an alternative to barcodes as a descriptor of persistent homology) of ground truth and prediction by minimizing the squared distance of matched points. However, this matching has a fundamental limitation, in that it cannot guarantee that the matched structures are spatially related in any sense (see Fig. 1 and App. A).
Put succinctly, the cycles are matched irrespective of their location within the image, which frequently has an adverse impact during training (see App. F). Clough et al. (2020) follow a similar approach and train without knowing the explicit ground truth segmentation, but only the Betti numbers it ought to have. Persistent homology has also been used by Abousamra et al. (2021) for crowd localization and by Waibel et al. (2022) for reconstructing 3D cell shapes from 2D images. Other methods incorporate pixel-overlaps of topologically relevant structures. For example, the clDice score, introduced by Shit et al. (2021), computes the harmonic mean of the overlap of the predicted skeleton with the ground truth volume and vice versa. Hu & Chen (2021) and Jain et al. (2010) use homotopy warping to identify critical pixels and measure the topological difference between grayscale images. Hu et al. (2021) utilize discrete Morse theory (see Delgado-Friedrichs et al. (2014)) to compare critical topological structures within prediction and ground truth. Wang et al. (2022) incorporate a marker loss, based on the Dice loss between a predicted marker map and the ground truth marker map, to improve gland segmentations topologically. Generally, these overlap-based approaches are computationally efficient but do not explicitly guarantee the spatial correspondence of the topological features. Other approaches aim at enforcing topologically motivated priors, for example connectivity priors (Sasaki et al. (2017); Wang & Jiang (2018)). Mosinska et al. (2018) applied task-specific pre-trained filters to improve connected components. Zhang & Lui (2022) use template masks as an input to enforce the diffeomorphism type of a specific shape. Further work by Cheng et al. (2021) jointly models connectivity and features based on iterative feedback learning. Oner et al. (2020) aim to improve the topological performance by enforcing region separation of curvilinear structures.

2. BACKGROUND ON ALGEBRAIC TOPOLOGY

We introduce the necessary concepts from algebraic topology in order to describe the construction of induced matchings for grayscale images. For the basic definitions, we refer to App. L.

2.1. GRAYSCALE IMAGES AS FILTERED CUBICAL COMPLEXES

The topology of grayscale images is best captured by filtered cubical complexes. In order to filter a cubical complex K, we consider an order-preserving function f : K → R. Its sublevel sets D(f)_r := f^{-1}((-∞, r]) assemble into the sublevel filtration D(f) = {D(f)_r}_{r∈R} of K. Since f can only take finitely many values f_1 < ... < f_l, the filtered cubical complex K_* given by K_i = D(f)_{f_i} for i = 1, ..., l already encodes all the information about the filtration. For a grayscale image I ∈ R^{m×n}, we consider the cubical grid complex K_{m,n} consisting of all cubical cells contained in [1, m] × [1, n] ⊆ R². Its filter function f_I is defined on the vertices of K_{m,n} by the corresponding entry of I, and on each higher-dimensional cube as the maximum value of its vertices. Note that f_I is order-preserving, so we can associate the sublevel filtration of f_I and its corresponding filtered cubical complex to the image I; we denote them by D(I) and K_*(I), respectively. This construction is called the V-construction, since pixels are treated as vertices of the cubical complex, see Fig. 3b. An alternative, the T-construction, considers pixels as top-dimensional cells of a 2-dimensional cubical complex (see Heiss & Wagner (2017)). We implemented both the V- and the T-construction in TopoMatch and encode them in the ValueMap array inside the CubicalPersistence class. Persistent homology considers filtrations of spaces and observes the lifetime of topological features within the filtration in the form of persistence modules. The basic premise is that features that persist for a long time are significant, whereas features with a short lifetime are likely caused by noise.
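As a concrete illustration, the V-construction filter values can be computed in a few lines of NumPy. The sketch below is our own illustration, not the paper's implementation: the function name and the (2m−1)×(2n−1) "cubical map" layout (vertices at even/even positions, edges at mixed positions, squares at odd/odd positions) are choices we make here for compactness.

```python
import numpy as np

def v_construction_filtration(img):
    """Filter values of the cubical grid complex under the V-construction:
    pixels are vertices, and every higher-dimensional cube takes the
    maximum value of its vertices. Returned in a (2m-1)x(2n-1) layout:
    even/even entries are vertices, even/odd and odd/even entries are
    edges, odd/odd entries are squares."""
    m, n = img.shape
    F = np.zeros((2 * m - 1, 2 * n - 1))
    F[::2, ::2] = img                                    # vertices carry pixel values
    F[1::2, ::2] = np.maximum(img[:-1, :], img[1:, :])   # vertical edges
    F[::2, 1::2] = np.maximum(img[:, :-1], img[:, 1:])   # horizontal edges
    F[1::2, 1::2] = np.maximum(F[1::2, :-2:2], F[1::2, 2::2])  # squares: max of their two vertical edges
    return F
```

Sorting the cells of this array by value (breaking ties by dimension) yields a compatible ordering as used for the reduction in Sec. 2.2.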

2.2. PERSISTENT HOMOLOGY AND ITS BARCODE

The persistent homology H_*(f) of an order-preserving function f : K → R consists of the vector spaces H_*(f)_r = H_*(D(f)_r) and the transition maps H_*(f)_{r,s} : H_*(D(f)_r) → H_*(D(f)_s) induced by the inclusions D(f)_r → D(f)_s for r ≤ s. Note that H_*(f) is a p.f.d. persistence module. By a result of Crawley-Boevey (2012), any p.f.d. persistence module is isomorphic to a direct sum of interval modules, M ≅ ⊕_{I∈B(M)} C(I). Here, B(M) denotes the barcode of M, which is given by a multiset of intervals. For a grayscale image matrix I ∈ R^{m×n} with associated filter function f_I : K_{m,n} → R, we refer to the persistent homology of f_I as the persistent homology of the image I and denote it by H_*(I). Its associated barcode is denoted by B(I). Note that the persistent homology is continuous from above: all intervals in the barcode are of the form [s, t). In order to compute the barcode B(I), we make use of the reduction algorithm described in Edelsbrunner et al. (2008). It starts by sorting the cells of the associated filtered cubical complex K_*(I) to obtain a compatible ordering c_1, ..., c_l: the cells in K_i precede the cells in K \ K_i, and the faces of a cell precede the cell. This ordering induces a cell-wise refinement L_*(I) of K_*(I), which we encode in the IndexMap array inside the CubicalPersistence class. The algorithm then performs a variant of Gaussian elimination on the boundary matrix of K, with rows and columns indexed by the cells in the compatible ordering. Adding a d-cell c_k to the complex either creates new homology in degree d or turns homology classes in degree d−1 trivial (see Figure 4). In the latter case, assuming that the class that becomes trivial when adding c_k was created by a cell c_j with j < k, we pair the cells c_j and c_k. In this way, we partition the set of cells into persistence pairs and singletons.
Each pair (c_j, c_k) satisfying f_I(c_j) < f_I(c_k) gives rise to a finite interval [f_I(c_j), f_I(c_k)), and each singleton c_i gives rise to an essential interval [f_I(c_i), ∞) in the barcode of I. Note that a finite interval [f_I(c_j), f_I(c_k)) determines a refined interval [j, k), and we call the set B_fine(I) of refined intervals the refined barcode of I. Alternatively, the refined barcode of I can be seen as the barcode of the persistent homology of the refined filtration L_*(I).
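The reduction described above can be sketched in a few lines. The following is a minimal, unoptimized version of the standard Z/2 column reduction (the actual implementation uses clearing and other optimizations, see App. C); `columns[k]` is assumed to hold the boundary of cell c_k as a set of indices of earlier cells in the compatible ordering.

```python
def reduce_boundary(columns):
    """Standard persistence reduction over Z/2. `columns[k]` is the set of
    row indices in the boundary of cell c_k; the list is mutated in place.
    Returns the persistence pairs (j, k) and the unpaired singletons."""
    low_to_col = {}   # pivot row -> index of the column whose lowest entry it is
    pairs = []
    for k in range(len(columns)):
        col = set(columns[k])
        # add earlier reduced columns (Z/2 addition = symmetric difference)
        while col and max(col) in low_to_col:
            col ^= columns[low_to_col[max(col)]]
        columns[k] = col
        if col:                      # c_k kills the class created by c_{max(col)}
            j = max(col)
            low_to_col[j] = k
            pairs.append((j, k))
    paired = {j for j, _ in pairs} | {k for _, k in pairs}
    singletons = [i for i in range(len(columns)) if i not in paired]
    return pairs, singletons
```

For example, a filtration of two vertices joined by an edge yields one pair (the younger vertex dies when the edge enters) and one singleton (the essential connected component).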

2.3. INDUCED MATCHINGS OF PERSISTENCE BARCODES

In order to give a constructive proof of the algebraic stability theorem of persistent homology, the authors of Bauer & Lesnick (2015) introduced the notion of induced matchings, which play a central role in our TopoMatch matching. The following theorem (paraphrased as a special case of the general Theorem 4.2 in Bauer & Lesnick (2015)) is key to the definition of induced matchings:

Theorem 1 Let Φ : M → N be a morphism of p.f.d., staggered persistence modules that are continuous from above. Then there are unique injective maps B(im Φ) → B(M) and B(im Φ) → B(N), which map an interval [b, c) ∈ B(im Φ) to an interval [b, d) ∈ B(M) with c ≤ d, and to an interval [a, c) ∈ B(N) with a ≤ b, respectively.

Note that im Φ is a p.f.d. submodule of N, and we refer to its barcode as the image barcode of Φ. The injections in Theorem 1 determine matchings σ_M : B(M) → B(im Φ) and σ_N : B(im Φ) → B(N). The induced matching of Φ is then given by the composition σ(Φ) = σ_N ∘ σ_M.

Induced matchings of grayscale images Let I, J ∈ R^{m×n} be matrices describing grayscale images such that I ≥ J (entry-wise). Then the sublevel sets of I form subcomplexes of the sublevel sets of J, and the corresponding inclusions D(I)_r → D(J)_r are cubical maps. Hence, they induce maps H_*(I)_r → H_*(J)_r in homology, which assemble into a persistence map Φ(I, J) : H_*(I) → H_*(J). We denote the image barcode of Φ(I, J) by B(I, J). Considering the refined filtrations L_*(I), L_*(J), we obtain staggered persistence modules resulting in refined barcodes B_fine(I), B_fine(J). For the computation of the image barcode, we follow the algorithm described in Bauer & Schmahl (2022). It involves the reduction of the boundary matrix of K_{m,n}, with rows indexed by the ordering c_1, ..., c_l of L_*(I) and columns indexed by the ordering d_1, ..., d_l of L_*(J).
A pair (c_i, d_j) with f_I(c_i) < f_J(d_j), obtained by means of this reduction, gives rise to an image persistence pair (c_i, d_j), which corresponds to the finite interval [f_I(c_i), f_J(d_j)) ∈ B(I, J). By matching refined intervals with the image persistence pairs according to Theorem 1, we obtain a matching σ_fine : B_fine(I) → B_fine(J) between the refined barcodes, which determines the induced matching σ(I, J) : B(I) → B(J) by replacing refined intervals with the corresponding finite intervals. In this work, we augment this induced matching by additionally considering reverse persistence pairs, i.e., pairs (c_i, d_j) obtained by the reduction with f_I(c_i) ≥ f_J(d_j) (see Figure 5d). When this is the case, we also match the corresponding intervals in B_fine(I) and B_fine(J). Note that this is a slight variation of the induced matching defined in Bauer & Lesnick (2015). This extension satisfies similar properties and is a natural adaptation to our computational context.

3. TOPOMATCH -FROM ALGEBRAIC TOPOLOGY TO IMAGE SEGMENTATION

In general, the structure of interest in segmentation tasks is given by the foreground. Therefore, in applications we consider superlevel filtrations instead of sublevel filtrations. For simplicity, we stick to sublevel filtrations to describe the theoretical background. Throughout this section, we denote by L ∈ [0, 1]^{m×n} a likelihood map predicted by a deep neural network, by P ∈ {0, 1}^{m×n} the binarized prediction of L, and by G ∈ {0, 1}^{m×n} the ground truth segmentation.

Figure 2: (g) and (h) show the induced matchings between the individual barcodes (matchings indicated in grey), and (i) shows the resulting TopoMatch matching between B(L) and B(G), which matches a red interval to a blue interval if there is a green interval in between. We use this matching to define our loss and metric.

3.1. THE TOPOMATCH MATCHING

In order to visualize that two objects in two different images are at the same location, we can simply move one image on top of the other and observe that the locations of the objects now agree. Thereby, we construct a common ambient space for both images, which allows us to identify locations. Following this idea, in order to find a matching between B(L) and B(G) that takes the location of the represented topological features into account, we look for a common ambient filtration of K_{m,n}, given by a comparison image C. In the sublevel setting, we choose C = min(L, G), so that each sublevel set of C is the union of the corresponding sublevel sets of L and G; since L ≥ C and G ≥ C, we obtain induced matchings σ(L, C) : B(L) → B(C) and σ(G, C) : B(G) → B(C) (see Sec. 2.3). The TopoMatch matching τ(L, G) : B(L) → B(G) is then given by the composition τ(L, G) = σ(G, C)^{-1} ∘ σ(L, C). Working with superlevel sets yields an analogous construction: in the superlevel setting, we choose C = max(L, G) as the comparison image to guarantee that each superlevel set of the comparison image is the union of the corresponding superlevel sets of ground truth and likelihood map. Persistent homology is stable, see Chazal et al. (2009a), i.e., there exist metrics on the set of persistence diagrams for which slight variations in the input result in small variations of the corresponding persistence diagram.
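Once the two induced matchings into the comparison barcode are available, the composition τ(L, G) = σ(G, C)^{-1} ∘ σ(L, C) reduces to a dictionary operation. The sketch below is our own illustration, assuming each matching is given as a dict from intervals of B(L) resp. B(G) to the intervals of B(C) they are matched with (any hashable interval representation works).

```python
def compose_matchings(sigma_LC, sigma_GC):
    """TopoMatch composition tau(L, G) = sigma(G, C)^{-1} o sigma(L, C).
    Both inputs map intervals of B(L) resp. B(G) to intervals of B(C);
    two intervals are matched iff they hit the same interval of B(C)."""
    inv_GC = {c: g for g, c in sigma_GC.items()}   # invert sigma(G, C)
    return {l: inv_GC[c] for l, c in sigma_LC.items() if c in inv_GC}
```

Intervals of B(L) whose image in B(C) is not hit by any interval of B(G) remain unmatched, exactly as in the barcode picture of Fig. 2.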
Therefore, it is natural to require Dgm(L) to be similar to Dgm(G). A frequently used metric to measure the difference between persistence diagrams is the Wasserstein distance (see Cohen-Steiner et al. (2010)), which has been adapted in Hu et al. (2019) to train segmentation networks.

3.2. TOPOMATCH DEFINES A TOPOLOGICAL LOSS FUNCTION

Because of the shortcomings described in Fig. 1, 8c, 19b and App. A, F, we propose to replace the Wasserstein matching γ* by the TopoMatch matching τ(L, G) and define the TopoMatch loss

L_TM(L, G) = Σ_{q∈Dgm(L)} ‖q − τ(L, G)(q)‖²₂.

Since the values in L and G are contained in [0, 1], we replace the essential intervals [a, ∞) with the finite intervals [a, 1] to obtain a well-defined expression. To efficiently train segmentation networks, we combine our TopoMatch loss with a standard volumetric loss, specifically the Dice loss, to L_train = αL_TM(L, G) + L_dice(L, G).

Gradient of the TopoMatch loss Note that we can view L = L(I, ω) as a function that assigns the predicted likelihood map to an image I ∈ R^{m×n} and the segmentation network parameters ω ∈ R^l. A point q = (q_1, q_2) ∈ Dgm(L) describes a topological feature that is born by adding pixel b(q) (birth of q) and killed by adding pixel d(q) (death of q) to the filtration. The coordinates of q are then determined by the values q_1 = L_{d(q)} and q_2 = L_{b(q)}. Assuming that the TopoMatch matching is constant in a sufficiently small neighborhood around the given predicted likelihood map L, the TopoMatch loss is differentiable in ω, and the chain rule yields the gradient

∇_ω L_TM(L, G) = Σ_{q∈Dgm(L)} [ 2(q_1 − τ(L, G)(q)_1) ∂L_{d(q)}/∂ω + 2(q_2 − τ(L, G)(q)_2) ∂L_{b(q)}/∂ω ].

Note that likelihood maps for which this assumption is not satisfied may exist. However, this requires L to have at least two entries with the exact same value, and the set of such likelihood maps has Lebesgue measure zero. Therefore, the gradient is well-defined almost everywhere; in the edge cases, we consider it as a sub-gradient, which still reduces the loss and has a positive effect on the topology of the segmentation.

Physical meaning of the gradient To understand the effect of the TopoMatch gradient during training, consider the example in Fig. 7. Let x, y ∈ Dgm(L) denote the points corresponding to the yellow and blue cycle in (c), respectively. (b) shows that x is matched and y is unmatched.
Since all points in Dgm(G) are of the form (0, 1), TopoMatch maps x to (0, 1) and y to its closest point ((y_1+y_2)/2, (y_1+y_2)/2) on the diagonal ∆. Therefore, the gradient will push the segmentation network to move x closer to (0, 1) (i.e., decrease x_1 = L_{d(x)} and increase x_2 = L_{b(x)}) and y closer to ((y_1+y_2)/2, (y_1+y_2)/2) (i.e., increase y_1 = L_{d(y)} and decrease y_2 = L_{b(y)}). This amplifies the local contrast between ⋆ and × of the yellow cycle and reduces the local contrast between ⋆ and × of the blue cycle, which improves the topological performance of the segmentation. In summary, matched features get emphasized and unmatched features get suppressed during training, which highlights the importance of finding a spatially correct matching (see App. F for further discussion).
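The loss and its gradient with respect to the likelihood map can be sketched directly from the formulas of Sec. 3.2 (the chain rule step through the network parameters ω is left to the autograd framework). The function below uses an illustrative input format of our own choosing: each feature is a (birth_pixel, death_pixel, target) triple, where target is the matched point of Dgm(G) or None for an unmatched feature. Treating the diagonal projection of an unmatched point as a constant yields the same gradient as differentiating through it.

```python
import numpy as np

def topomatch_loss_and_grad(L, features):
    """L: likelihood map (2D array). `features`: list of
    (birth_pixel, death_pixel, target) tuples, where each pixel is an
    index into L and target is the matched point of Dgm(G), or None for
    an unmatched feature. Returns the loss and d(loss)/dL."""
    grad = np.zeros_like(L)
    loss = 0.0
    for b, d, target in features:
        q1, q2 = L[d], L[b]                  # death value, birth value
        if target is None:                   # unmatched: project onto the diagonal
            t1 = t2 = 0.5 * (q1 + q2)
        else:
            t1, t2 = target
        loss += (q1 - t1) ** 2 + (q2 - t2) ** 2
        grad[d] += 2 * (q1 - t1)             # d(loss)/dL_{d(q)}
        grad[b] += 2 * (q2 - t2)             # d(loss)/dL_{b(q)}
    return loss, grad
```

For a feature matched to (0, 1), the gradient decreases the death value and increases the birth value (contrast amplification); for an unmatched feature, it does the opposite (contrast suppression), as described above.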

3.3. TOPOMATCH LOSS AS A TOPOLOGICAL METRIC FOR IMAGE SEGMENTATION

Betti number error The Betti number error β_err (see App. K) compares the topological complexity of the binarized prediction P and the ground truth G. However, it is limited in that it only compares the number of topological features in both images while ignoring their spatial correspondence (see Fig. 8). In terms of persistence diagrams, the Betti number error can be expressed by considering a maximal matching β : Dgm(P) → Dgm(G), e.g., the Wasserstein matching (see App. F), and counting the number of unmatched points: β_err(P, G) = #ker(β) + #coker(β). Here, for a matching σ, we denote by ker(σ) the multiset of unmatched points in the domain of σ and by coker(σ) the multiset of unmatched points in the codomain of σ (see App. L.5).
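For binary 2D images, the Betti numbers themselves can be computed with standard connected-component labeling. The sketch below is our own helper (not part of the paper's code): it counts β₀ directly and obtains β₁ from the components of the padded complement, using the duality between an 8-connected foreground and a 4-connected background, in the spirit of the T-construction.

```python
import numpy as np
from scipy import ndimage

def betti_numbers(img):
    """beta_0 and beta_1 of a binary 2D image (foreground 8-connected,
    background 4-connected)."""
    img = np.pad(np.asarray(img, bool), 1)   # force a single outer background component
    _, b0 = ndimage.label(img, structure=np.ones((3, 3), int))  # 8-connectivity
    _, holes = ndimage.label(~img)           # default 4-connectivity
    return b0, holes - 1                     # subtract the unbounded outer component

def betti_number_error(P, G):
    """Sum of dim-0 and dim-1 Betti number differences."""
    return sum(abs(p - g) for p, g in zip(betti_numbers(P), betti_numbers(G)))
```

For multisets of identical points (0, 1), counting unmatched points under a maximal matching reduces exactly to these absolute differences of feature counts.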

TopoMatch error

The TopoMatch loss L_TM(P, G) can be seen as a refinement of the Betti number error, which also takes the location of the features within their respective images into account (see Fig. 8). Since the entries of P and G take values in {0, 1}, the only point appearing in their persistence diagrams is (0, 1), and its multiplicity coincides with the number of features in the respective image. Observe that an unmatched point contributes (0 − 1/2)² + (1 − 1/2)² = 1/2 to L_TM(P, G), whereas a matched pair of points contributes 0. Hence, the TopoMatch loss takes values in ½N₀ and is given by half the number of unmatched features in P and G combined.

4. EXPERIMENTS

2013) have "blob-like" foreground structures. They contain very few dimension-1 features, but every instance of a cell or building forms a dimension-0 feature.

Training of the segmentation networks For implementation details, e.g., the training splits, please refer to App. I and J. We train all our models for a fixed, dataset-specific number of epochs and evaluate the final model on an unseen test set. We train all models on an Nvidia P8000 GPU using the Adam optimizer. We run experiments on a range of alpha-parameters for clDice (Shit et al. (2021)).

Figure 9: Qualitative results on the CREMI dataset (same models as in Table 1). Topological errors are indicated by red circles. Our method leads to fewer topological errors in the segmentation.

Main Results

Our proposed TopoMatch loss improves the topological accuracy of the segmentations across all datasets (Table 1), irrespective of the choice of hyper-parameters (Table 3), compared to all baselines. We show superior scores for the topological metrics TopoMatch error (T.M.) and Betti number error (Betti) in both dimension 0 and dimension 1. Further, the volumetric metrics (Accuracy, Dice, and clDice) of the segmentations show equivalent, if not superior, quantitative results for our method. Our method can be trained from scratch or used to refine pre-trained networks. Importantly, our method improves the topological correctness of curvilinear segmentation problems (Roads, CREMI), blob-segmentation problems (Buildings, Colon), and mixed problems (SynMnist, Elegans). We attribute this to the theoretical guarantees of induced matchings, which hold for the foreground and the background classes in both dim-0 and dim-1. For illustration, consider the Roads and Buildings datasets: the topology of the background of the Buildings dataset is very similar to that of the foreground in Roads. That is, the foreground of Roads and the background of Buildings are interesting in dim-1, whereas the background of Roads and the foreground of Buildings are interesting in dim-0. Since our method can efficiently leverage the topological features of both foreground and background by applying sub- and superlevel-set matching, it is intuitive that it prevails in both. It is of note that for some datasets the method by Hu et al. (2019) is the best-performing baseline, and for others the method by Shit et al. (2021).

Ablation experiments In order to study the effectiveness of the TopoMatch loss, we conduct various ablation experiments. First, we study the effect of the α parameter in our method, see Table 3. We find that increasing α improves the topological metrics. For some datasets, e.g., SynMnist, the Dice metric is compromised if α is chosen too big.
Therefore, we conclude that α is a tunable, dataset-specific parameter. Admittedly, the effect of the α parameter cannot be compared directly across methods. Nonetheless, it appears that our method is more robust towards variation in α. Second, we study the effect of considering both the foreground and the background (bothlevel) versus solely the foreground (superlevel). We find that bothlevel is particularly useful if the background has a complex topology (e.g., Elegans), whereas superlevel shows a similar performance if the foreground has a more complex topology (e.g., CREMI), see Table 2. Third, we test the effect of pre-training versus training from scratch for TopoMatch and the method by Hu et al. (2019). Table 6 shows that our method can be trained from scratch efficiently, if not superiorly, whereas the baseline method struggles in that setting, especially on more complex datasets such as CREMI. We attribute this to the spatially correct matching of TopoMatch and its consequences for the gradient (see Sec. 3.2). Training from scratch means that there is a lot of potential for false positives and false negatives in the Wasserstein matching (see App. F), since there are many noisy features while the network is still uncertain. For example, on CREMI we found that the Wasserstein matching matches cycles incorrectly in more than 99% of the cases, see App. F.1. Moreover, we observe that TopoMatch optimizes the Wasserstein loss more efficiently (see App. F.2). We also experiment with adding a boundary to images in order to close loops that cross the image border, similar to Hu et al. (2019), and term this relative TopoMatch. Table 4 shows a negligible effect on all metrics. For additional ablations and more metrics on the ablation studies, please refer to App. H. The computational complexity of our matching is O(n³), see App. D for details.

5. DISCUSSION

Concluding remarks In this paper, we propose a rigorous method called TopoMatch, which enables the faithful quantification of corresponding topological properties in image segmentation. Our method is the first to guarantee a matching of persistence barcodes in image segmentation that respects their spatial correspondence. We show that TopoMatch is effective as an interpretable segmentation metric, which can be understood as a sharpened variant of the Betti error. Further, we show how our method can be used to train segmentation networks. Training networks using TopoMatch is stable and leads to improvements on all six datasets. We foresee vast application potential in challenging tasks such as road network, vascular network, and neuron instance segmentation. We are thus hopeful that our method's theory and experimentation will stimulate future research in this area.

Limitations In the general setting of persistent homology of functions on arbitrary topological spaces, there are instances where maps of persistence modules cannot be written as matchings. This is somewhat analogous to the fact that, in linear algebra, certain linear transformations cannot be diagonalized. We did not observe any such case in our specific segmentation setting; a theoretical investigation of this question will be the subject of future work. Further, we acknowledge application-specific experimental limitations. Our method's computational complexity is beyond widely used loss functions such as BCE (see App. D); moreover, our current implementation is only available in 2D, whereas the theoretical guarantees trivially generalize to 3D.

6. REPRODUCIBILITY STATEMENT

To facilitate understanding of the theory section, please refer to the basic definitions in the appendix. The core algorithm (TopoMatch) is available as a Python script in the supplementary material and is printed in pseudocode in App. C. The training details are described in App. I. Furthermore, all of our code and all of our experimentation, including baselines and hyperparameters, is available in a public anonymous GitHub repository.

7. ETHICS STATEMENT

We, the authors, declare that we strictly adhere to the ICLR Code of Ethics. Our method and experimentation are carried out on (partly modified) public datasets with no known ethical concerns. Our studies do not involve human subjects, and we do not foresee any conflicts of interest or sponsorship, discrimination/bias/fairness concerns, privacy and security issues, legal compliance issues, or research integrity issues.

Appendix figure: Matchings (ours vs. the Wasserstein matching of Hu et al. (2019)) for Elegans, Colon, and Buildings label-prediction pairs. Here we match the connected components (dim-0). The matched components (according to the respective matching method) are represented in the same color. We randomly sample 6 matched components in each pair.

Appendix figure: Matchings (ours vs. the Wasserstein matching of Hu et al. (2019)) for Roads label-prediction pairs. The matched cycles (according to the respective matching method) are represented in the same color. We randomly sample 6 matched cycles (dim-1) in each pair. We observe that our method correctly matches the cycles in the first two rows. The third row shows an example from early in training: here our method correctly matches some "finished" cycles, but also provides a correct matching for the blue and green cycles, which still have to be closed. Essentially, one can observe that TopoMatch leads to a correct loss.

C DETAIL ALGORITHM

Below, we provide the pseudocode for an efficient realization of the TopoMatch matching. For the computation of the barcodes in dimension 0, we leverage the Union-Find data structure, which is very efficient at managing equivalence classes. Alexander duality allows us to use it in dimension 1 as well (see Garin et al. (2020)). Moreover, it can also be used for the computation of the image barcodes in both dimensions. Note that we adapt the Union-Find class to also manage the birth of equivalence classes. We use clearing (as proposed in Bauer (2021)) by keeping track of critical edges and columns-to-reduce, in order to reduce the number of operations during the reductions (see Sections 2.2 and 2.3).
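The birth-tracking variant of Union-Find mentioned above can be sketched as follows. This is a minimal version with path halving and the elder rule (the younger class dies on a merge); the actual implementation in the supplementary material may differ in details.

```python
class BirthUnionFind:
    """Union-Find that keeps, for each class, the index of its
    earliest-born cell, as used for the dimension-0 barcode."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.birth = list(range(n))   # assumes cells are indexed in filtration order

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        """Merge the classes of x and y. Returns the birth index of the
        dying (younger) class, which together with the current edge
        yields a persistence pair, or None if the edge closes a cycle."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return None               # no dim-0 death; the edge creates a cycle
        if self.birth[rx] <= self.birth[ry]:
            old, young = rx, ry       # elder rule: the younger class dies
        else:
            old, young = ry, rx
        self.parent[young] = old
        return self.birth[young]
```

Processing the edges in filtration order and recording the returned birth indices produces exactly the dimension-0 persistence pairs of Sec. 2.2.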

Algorithm 1: TopoMatch

Data: G, L
Options: relative = False, filtration = 'superlevel'
Result: L_0, L_1, τ(L, G)
begin
    if filtration = 'superlevel' then            // construction of the comparison image
        C ← max(G, L)
    else
        C ← min(G, L)
    end
    B(G), D_G, V_G, X_G ← CubicalPersistence(G, relative, filtration, True)
    B(L), D_L, V_L, X_L ← CubicalPersistence(L, relative, filtration, True)
    B(C), C_C, V_C, X_C ← CubicalPersistence(C, relative, filtration, False)
    B(G, C) ← ImagePersistence(D_G, X_G, C_C, X_C)
    B(L, C) ← ImagePersistence(D_L, X_L, C_C, X_C)
    σ(G, C) ← InducedMatching(B(G, C), B(G), B(C))
    σ(L, C) ← InducedMatching(B(L, C), B(L), B(C))
    τ(L, G) ← ∅                                  // initialize matched refined intervals
    U_0, U_1 ← B(G)_0, B(G)_1                    // initialize unmatched refined intervals for ground truth
    V_0, V_1 ← B(L)_0, B(L)_1                    // initialize unmatched refined intervals for prediction
    L_0 ← 0; L_1 ← 0                             // initialize TopoMatch loss
    for d ← 0 to 1 do                            // loop over dimension d
        foreach m_0 ∈ σ(G, C)_d do
            foreach m_1 ∈ σ(L, C)_d do
                if m_0[2] = m_1[2] then          // check for the same image persistence pair
                    add (m_0[0], m_0[2], m_1[0]) to τ(L, G)_d
                    remove m_0[0] from U_d
                    remove m_1[0] from V_d
                    remove m_1 from σ(L, C)_d
                    p, q ← m_0[0], m_1[0]
                    I_0, I_1 ← V_G(Index2Coord(p[0])), V_G(Index2Coord(p[1]))   // map index to value
                    J_0, J_1 ← V_L(Index2Coord(q[0])), V_L(Index2Coord(q[1]))   // map index to value
                    L_d ← L_d + (I_0 − J_0)² + (I_1 − J_1)²                     // loss for matched intervals
                end
            end
        end
        foreach p ∈ U_d do
            I_0, I_1 ← V_G(Index2Coord(p[0])), V_G(Index2Coord(p[1]))           // map index to value
            L_d ← L_d + (I_0 − I_1)²/2           // loss for unmatched intervals in ground truth
        end
        foreach p ∈ V_d do
            I_0, I_1 ← V_L(Index2Coord(p[0])), V_L(Index2Coord(p[1]))           // map index to value
            L_d ← L_d + (I_0 − I_1)²/2           // loss for unmatched intervals in prediction
        end
    end
end

Figure 18: Plot of the empirical convergence curves of our TopoMatch loss for
the CREMI, MNIST, and Elegans datasets. We plot the TopoMatch contribution to the training loss over a varying number of epochs, which depends on the dataset size. We observe that the TopoMatch loss converges efficiently on the different datasets. The absolute magnitude of the loss varies from dataset to dataset because TopoMatch is an interpretable measure of the dim-0 and dim-1 topological features in the training images. E.g., CREMI has a substantially higher number of features, especially cycles, than Elegans; therefore, its absolute loss magnitude is higher.
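The core matching step of the algorithm above can be sketched compactly. The following is a minimal, illustrative sketch (not the released implementation) that assumes the two induced matchings are given as lists of pairs (interval, image_interval), with intervals represented as (birth_value, death_value) tuples and, for simplicity, no repeated image intervals:

```python
# Hypothetical data layout: each induced matching is a list of pairs
# (interval, image_interval). Two intervals of B(G) and B(L) are matched
# iff they are matched to the SAME interval of B(C).

def topomatch(sigma_gc, sigma_lc):
    """Match intervals of B(G) and B(L) that share an image-persistence
    interval in B(C); return matched pairs and the unmatched remainders."""
    by_image = {img: gt for gt, img in sigma_gc}
    matched, unmatched_pred = [], []
    for pred, img in sigma_lc:
        if img in by_image:
            matched.append((by_image.pop(img), pred))
        else:
            unmatched_pred.append(pred)
    unmatched_gt = list(by_image.values())
    return matched, unmatched_gt, unmatched_pred

def topomatch_loss(matched, unmatched_gt, unmatched_pred):
    """Squared-endpoint loss: matched intervals are pulled together,
    unmatched intervals are pushed onto the diagonal."""
    loss = sum((i0 - j0) ** 2 + (i1 - j1) ** 2
               for (i0, i1), (j0, j1) in matched)
    loss += sum((b - d) ** 2 / 2 for b, d in unmatched_gt)
    loss += sum((b - d) ** 2 / 2 for b, d in unmatched_pred)
    return loss
```

In the actual algorithm, the endpoint values are looked up through the value maps V_G, V_L at critical cells, which is what makes the loss differentiable with respect to the predicted likelihoods.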

F WASSERSTEIN MATCHING

The pth Wasserstein distance is frequently used to measure the difference between persistence diagrams; it is given by d_p(B_1, B_2) = inf_γ ( Σ_{q ∈ Dgm(B_1)} ∥q - γ(q)∥_∞^p )^{1/p}, where the infimum is taken over all bijections γ : Dgm(B_1) → Dgm(B_2). However, the optimal bijection γ* is not necessarily spatially correct (see Figs. 1, 8c, 19b and App. A) and can have a negative impact on the training of segmentation networks. To see this, we distinguish two cases for a fixed point q = (q_1, q_2) ∈ Dgm(L):

Case 1 (false positive): q is matched, but there is no spatially corresponding feature in G. Since q is matched to the point (0, 1) ∈ Dgm(G), the loss L_W will be reduced by decreasing the value q_1 and increasing the value q_2. Hence, the segmentation network will learn to increase the local contrast of the feature described by q (see Sec. 3.2), although it should be decreased.

Case 2 (false negative): q is unmatched, but there is a spatially corresponding feature in G. Since q is unmatched, the bijection γ* maps it to its closest point ((q_1 + q_2)/2, (q_1 + q_2)/2) on the diagonal ∆, and the loss L_W will be reduced by increasing the value q_1 and decreasing the value q_2. Hence, the segmentation network will learn to decrease the local contrast of the feature described by q (see Sec. 3.2), although it should be increased.
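For very small diagrams, the Wasserstein distance can be computed by brute force, which makes the role of the diagonal ∆ explicit. A minimal sketch (plain Python; each diagram is padded with the diagonal projections of the other diagram's points, and diagonal-to-diagonal matches cost nothing; practical implementations use optimal-transport solvers instead):

```python
from itertools import permutations

def diag(q):
    """Closest point to q on the diagonal Δ."""
    m = (q[0] + q[1]) / 2
    return (m, m)

def wasserstein(dgm1, dgm2, p=2):
    """Brute-force p-Wasserstein distance between two tiny diagrams.
    Every point may be matched to a point of the other diagram or to Δ."""
    n1, n2 = len(dgm1), len(dgm2)
    pts1 = dgm1 + [diag(q) for q in dgm2]   # pad with diagonal projections
    pts2 = dgm2 + [diag(q) for q in dgm1]
    def c(i, j):
        if i >= n1 and j >= n2:
            return 0.0                      # diagonal matched to diagonal
        a, b = pts1[i], pts2[j]
        return max(abs(a[0] - b[0]), abs(a[1] - b[1])) ** p  # ∞-norm cost
    best = min(sum(c(i, perm[i]) for i in range(n1 + n2))
               for perm in permutations(range(n1 + n2)))
    return best ** (1 / p)
```

Note that the optimal bijection only minimizes geometric distances between diagram points; it carries no information about where the corresponding features lie in the image, which is exactly the failure mode discussed above.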

F.1 FREQUENCY OF INCORRECT WASSERSTEIN MATCHING

Next, we study how frequently these two cases occur. Assuming that the TopoMatch matching is correct, we evaluate the quality of the Wasserstein matching on the CREMI dataset. To this end, we choose a segmentation model to obtain label-prediction pairs for every image in the CREMI dataset and compute both matchings. Among the 37243 intervals in the barcodes of the predictions matched by the Wasserstein matching, only 224 are matched correctly, i.e., it achieves a precision of 0.6%.

F.2 WASSERSTEIN LOSS AS BETTI NUMBER ERROR

For a binarized output P and ground truth G, the Wasserstein loss and the Betti number error are closely related. A similar argument as in Sec. 3.3 for the TopoMatch loss shows that β_err(P, G) = 2 L_W(P, G). The lower Betti number error of models trained with our TopoMatch loss compared to models trained with the Wasserstein loss thus indicates that the TopoMatch loss produces more faithful gradients during the training of segmentation networks. Note that, empirically, models trained with the TopoMatch loss consistently outperform models trained with the Wasserstein loss with regard to the Betti number error (see Tables 1, 3).

G PERSISTENCE DIAGRAMS AND BARCODES

H ADDITIONAL ABLATION EXPERIMENTS

J NETWORK SPECIFICATIONS

We use the following notation: 1. In(input channels), Out(output channels), B(output channels) denote the input, output, and bottleneck information (for U-Net); 2. C(filter size, output channels) denotes a convolutional layer followed by ReLU and batch normalization; 3. U(filter size, output channels) denotes a transposed convolutional layer followed by ReLU and batch normalization; 4. ↓2 denotes max-pooling; 5. ⊕ indicates concatenation of information from an encoder block.

J.1 UNET CONFIGURATION-I

We use this configuration for the CREMI, synthMNIST, Colon, and Elegans datasets. This is a lightweight U-Net which has sufficient expressive power for these datasets.

A d-dimensional (cubical) complex in R^n is a set of cubical cells in R^n with maximal dimension d that is closed under the face relation, i.e., if d ∈ K and c is a face of d, then c ∈ K. Furthermore, we call a cubical complex K′ ⊆ K a subcomplex of K. A filtration of a cubical complex K is given by a family (K_r)_{r∈R} of subcomplexes of K which satisfies: (1) K_r ⊆ K_s for all r ≤ s; (2) K = K_r for some r ∈ R. A filtered (cubical) complex K_* is a cubical complex K together with a nested sequence of subcomplexes, i.e., a sequence of complexes ∅ = K_0 ⊆ K_1 ⊆ · · · ⊆ K_m = K. A function f : K → R on a cubical complex is said to be order-preserving if f(c) ≤ f(d) whenever c is a face of a cell d. A map f : K → K′ between cubical complexes is said to be cubical if it respects the face relation, i.e., f(c) must be a face of f(d) in K′ if c is a face of d in K.
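The sublevel filtration of an image can be made concrete by listing the cells of its cubical complex together with their filtration values. A minimal sketch (plain Python, V-construction with pixels as vertices; the cell encoding as a tuple of its vertices is an illustrative choice, not the paper's data structure):

```python
def cubical_filtration(img):
    """Cells of the 2D cubical complex of a grayscale image (V-construction:
    pixels are vertices) with sublevel filtration values: every cell gets the
    maximum value over its vertices, making the function order-preserving."""
    M, N = len(img), len(img[0])
    cells = {}  # cell -> filtration value; a cell is a tuple of its vertices
    for i in range(M):
        for j in range(N):
            cells[((i, j),)] = img[i][j]                      # vertex
            if j + 1 < N:                                     # horizontal edge
                cells[((i, j), (i, j + 1))] = max(img[i][j], img[i][j + 1])
            if i + 1 < M:                                     # vertical edge
                cells[((i, j), (i + 1, j))] = max(img[i][j], img[i + 1][j])
            if i + 1 < M and j + 1 < N:                       # unit square
                vs = ((i, j), (i, j + 1), (i + 1, j), (i + 1, j + 1))
                cells[vs] = max(img[v[0]][v[1]] for v in vs)
    return cells
```

Sorting the cells by value (with faces before cofaces on ties) yields a cell-wise refinement of the filtration as used in the persistence computation.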

L.3 HOMOLOGY OF CUBICAL COMPLEXES

Homology is a powerful concept involving local computations to capture information about the global structure of topological spaces. It assigns to a space a sequence of abelian groups that encode its topological features in all dimensions. A feature in dimension-0 describes a connected component, in dimension-1 it describes a loop, and in dimension-2 it describes a cavity. Homology also relates these features between spaces by inducing homomorphisms between their respective homology groups. We briefly introduce the homology of cubical complexes with coefficients in F_2. For more details, we refer to Kaczynski et al. (2004).



Code: https://anonymous.4open.science/r/TopoMatch-ED20/README.md

Procedure CubicalPersistence(I, relative, filtration, critical)



Figure 1: Motivation. Comparison of our TopoMatch and the Wasserstein matching (Hu et al. (2019)). We match cycles between label and prediction for a CREMI image and denote matched pairs in the same color. For presentation clarity, we visualize only six matched pairs (randomly selected out of the total 23 matches for both methods). Note that TopoMatch always matches spatially correctly, while the Wasserstein matching gets most matches wrong.

Figure 2: (a) and (c) show two predictions for ground truth (b). Volumetric metrics, e.g., Dice, favor (a) over (c), and even the Betti number error cannot differentiate between (a) and (c); only TopoMatch detects the spatial error in (a) and favors (c).

Figure 3: (a) shows an image and (b) visualizes the V-construction.

Figure 4: A filtered cubical complex with varying homology in degree 1. Adding the green 1-cell in (b) creates homology and adding the red 2-cell in (c) turns homology trivial. Together they form a persistence pair.

Figure 5: (a), (c) and (e) show images which satisfy I ≥ J 1 , J 2 . (b) and (d) visualize the induced matchings. Red bars correspond to the barcode of I, green bars to the barcodes of J 1 , J 2 and grey bars to the image barcodes B(I, J 1 ), B(I, J 2 ). The shaded gray area highlights matched intervals according to their endpoints.

Figure 6: An exemplary construction of the TopoMatch matching. (a)-(f) show a likelihood map L, a ground truth G, the comparison image C and their barcodes. (g) and (h) show the induced matchings between individual barcodes (matchings indicated in grey), and (i) shows the resulting TopoMatch matching between B(L) and B(G), which matches a red interval to a blue interval if there is a green interval in between. We use this matching to define our loss and metric.

(a) big enough to contain the sublevel filtrations of L and G; (b) fine enough to capture the topologies of L and G. Here, (a) guarantees that we can compute induced matchings of the respective inclusions, and (b) guarantees that the identification of features by the induced matchings is non-trivial (discriminative). The most natural candidate that comes to mind is the union D(L)_r ∪ D(G)_r of sublevel sets. Therefore, we introduce the comparison image C = min(L, G) (entry-wise minimum) and observe that D(C)_r = D(L)_r ∪ D(G)_r. By construction, we have C ≤ L, G and obtain induced matchings σ(L, C) : B(L) → B(C) and σ(G, C) : B(G) → B(C).
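The defining property D(C)_r = D(L)_r ∪ D(G)_r of the comparison image can be checked directly. A minimal sketch (plain Python; the toy images are hypothetical):

```python
def sublevel_set(img, r):
    """Pixels with value <= r, i.e., the sublevel set D(img)_r."""
    return {(i, j) for i, row in enumerate(img)
            for j, v in enumerate(row) if v <= r}

def comparison_image(L, G):
    """Entry-wise minimum C = min(L, G)."""
    return [[min(a, b) for a, b in zip(rl, rg)] for rl, rg in zip(L, G)]

L_img = [[0.2, 0.8], [0.5, 0.1]]
G_img = [[0.9, 0.3], [0.4, 0.6]]
C = comparison_image(L_img, G_img)

# sublevel sets of C are exactly the unions of the sublevel sets of L and G
for r in (0.0, 0.25, 0.5, 0.75, 1.0):
    assert sublevel_set(C, r) == sublevel_set(L_img, r) | sublevel_set(G_img, r)
```

Since C ≤ L and C ≤ G hold entry-wise, the inclusions of sublevel filtrations induce the two matchings into B(C) used by TopoMatch.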

A TOPOLOGICAL LOSS FUNCTION FOR IMAGE SEGMENTATION

We denote by R the extended real line R ∪ {-∞, ∞}. A barcode B consisting of intervals [a, b) can then equivalently be seen as a multiset Dgm(B) of points (a, b) ∈ R^2 which lie above the diagonal ∆ = {(x, x) | x ∈ R}. Furthermore, we add all the points on the diagonal ∆ with infinite multiplicity to Dgm(B) and thus define the persistence diagram of B. A matching σ : B_1 → B_2 between barcodes then corresponds to a bijection σ : Dgm(B_1) → Dgm(B_2) between persistence diagrams, by mapping unmatched points (a, b) to their closest point ((a + b)/2, (a + b)/2) on the diagonal ∆. We use these perspectives interchangeably (see Fig. 20). For simplicity, we denote by Dgm(I) the persistence diagram associated to the barcode of a grayscale image I.
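The translation from a partial matching between barcodes to a bijection between persistence diagrams can be sketched in a few lines (plain Python; intervals are (birth, death) tuples, multiplicities are ignored for simplicity):

```python
def matching_to_bijection(matching, dgm1, dgm2):
    """Extend a partial matching between two diagrams to a bijection by
    sending every unmatched point to its closest diagonal point
    ((a + b)/2, (a + b)/2), and the corresponding diagonal point back."""
    diag = lambda p: ((p[0] + p[1]) / 2,) * 2
    bij = dict(matching)              # matched points map to each other
    for p in dgm1:
        if p not in bij:
            bij[p] = diag(p)          # unmatched in Dgm(B1) -> Δ
    for q in dgm2:
        if q not in bij.values():
            bij[diag(q)] = q          # Δ -> unmatched in Dgm(B2)
    return bij
```

This is exactly the convention illustrated in Fig. 20: matched intervals become paired off-diagonal points, unmatched intervals are paired with the diagonal.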

Figure 7: (a) L shows a topological error (bottom right). (b) Matched cycles in TopoMatch are shown in yellow. (c) For both cycles in L, the birth (b(q)) and death (d(q)) pixels are marked with ⋆ and ×, respectively.

TM(P, G) = 1/2 ( # ker(τ(P, G)) + # coker(τ(P, G)) ).
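Given the TopoMatch matching τ(P, G), the metric counts unmatched intervals on both sides. A minimal sketch (plain Python; barcodes are lists of (birth, death) tuples, matched pairs are (ground-truth interval, prediction interval) tuples, multiplicities ignored):

```python
def tm_error(tau, barcode_gt, barcode_pred):
    """TopoMatch error: half the number of unmatched intervals, where the
    kernel collects unmatched ground-truth intervals and the cokernel
    collects unmatched prediction intervals."""
    matched_gt = {g for g, p in tau}
    matched_pred = {p for g, p in tau}
    ker = [b for b in barcode_gt if b not in matched_gt]
    coker = [b for b in barcode_pred if b not in matched_pred]
    return (len(ker) + len(coker)) / 2
```

A perfect segmentation matches every interval in both barcodes and yields TM = 0; every missed or spurious feature contributes 1/2.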

Figure 8: Illustration of the advantages of our TopoMatch error over the Betti number error. (a) shows a prediction P (left), a ground truth G (right), and the corresponding Betti number error. (b) shows the TopoMatch matching in dim-1 (no features are matched) with its corresponding loss, and (c) shows the Wasserstein matching in dim-1 (same color indicates a matching) with its corresponding loss. Note that both the Betti number error and the Wasserstein loss fail to represent the spatial error, while TopoMatch correctly does not match any cycles, resulting in a loss of 2.

4 EXPERIMENTATION

Datasets. We employ a set of six datasets with diverse topological features for our validation experiments. Two datasets, the Massachusetts roads dataset and the CREMI neuron segmentation dataset, exhibit frequently connected curvilinear, network-like structures, which form a large number of cycles in the foreground. The C. elegans infection live/dead image dataset (Elegans) from the Broad Bioimage Benchmark Collection (Ljosa et al. (2012)) and our synthetic, modified MNIST dataset (LeCun (1998)) (synMnist) consist of a balanced number of dimension-0 and dimension-1 features. Third, the colon cancer cell dataset (Colon) from the Broad Bioimage Benchmark Collection (Carpenter et al. (2006); Ljosa et al. (2012)) and the Massachusetts buildings dataset (Buildings) (Mnih (2013)) have "blob-like" foreground structures. They contain very few dimension-1 features, but every instance of a cell or building forms a dimension-0 feature.

Figure 10: Topological matchings. Illustration of the advantages of our TopoMatch algorithm over the existing Wasserstein matching and the Betti number error for two exemplary segmentations. On the left side, we depict a prediction-label pair for an image. On the right side, we depict the matched representative cycles in the same color for the TopoMatch matching (top row) and the Wasserstein matching (bottom row). Our TopoMatch matches the spatially correct features and penalizes the correct features in the loss. Here, the Wasserstein matching mismatches the correctly predicted feature with the erroneously predicted feature, leading to a false loss for the wrongly segmented cycle.

Figure 12: Motivation. Our TopoMatch (induced matching) and the Wasserstein matching (Hu et al. (2019)) for Roads label-prediction pairs. The matched cycles (according to the matching methods) are represented in the same color. We randomly sample 6 matched cycles (dim-1) in each pair. We observe that our method correctly matches the cycles in the first two rows. The third row shows an example from early in training. Here, we observe that our method correctly matches some "finished" cycles but also provides a correct matching for the blue and green cycles, which still have to be closed. Essentially, one can observe here that our TopoMatch leads to a correct loss.

Figure 13: Motivation. Our TopoMatch (induced matching) and the Wasserstein matching (Hu et al. (2019)) for CREMI label-prediction pairs. The matched cycles (according to the matching methods) are represented in the same color. We randomly sample 6 matched cycles in each pair.

Procedure CubicalPersistence(I, relative, filtration, critical) returns the refined barcodes (B(I)_0, B(I)_1), the columns to reduce C, the value map V, and the index map X.

D COMPUTATIONAL COMPLEXITY

For a grayscale image represented by a matrix I ∈ R^{M×N}, we have n = MN pixels and form a cubical grid complex of dimension d = 2. The computation of the filtration and the boundary matrix can be done efficiently using the CubeMap data structure (see Wagner et al. (2012)) with O(3^d n + d^2 n) time and O(d^2 n) space complexity. Computing the barcodes by means of the reduction algorithm requires cubic complexity in the number of pixels, O(n^3) (see Otter et al. (2017)). Despite our empirical acceleration due to the Union-Find class and clearing tricks (as described in Bauer & Schmahl (2022); Bauer (2021)), the worst-case complexity remains O(n^3). We need O(n^2) time to compute the final matching and loss. It is noteworthy that Hu et al. (2019) also needs O(n^3) time to compute the barcode and O(n^2) for the matching, whereas Shit et al. (2021) requires a lower complexity of O(n) due to its overlap-based loss formulation.

Figure 19: (a) A predicted likelihood map L and ground truth segmentation G. (b) visualizes the Wasserstein matching γ * (only the yellow cycles are matched), i.e. the top-left cycle in L is a false negative and the bottom-right cycle in L is a false positive.

Figure 20: Illustration of how to translate a matching between barcodes (a) into a bijection between persistence diagrams (b) and vice versa. A red or blue line in (a) is a dot of the same color in (b). In (a), a green interval in between a blue and a red line indicates that they are matched. In (b), a line connecting two points indicates that they are matched. For details, please refer to Section 3.2.

ConvBlock: C_B(3, out_size) ≡ C(3, out_size) → C(3, out_size) → ↓2
UpConvBlock: U_B(3, out_size) ≡ U(3, out_size) → ⊕ → C(3, out_size)
Encoder: In(1/3 ch) → C_B(3, 16) → C_B(3, 32) → C_B(3, 64) → C_B(3, 128) → C_B(3, 256) → B(256)

L BASIC DEFINITIONS AND TERMINOLOGY

L.1 CUBICAL COMPLEXES

A d-dimensional (cubical) cell in R^n is the Cartesian product c = ∏_{j=1}^{n} I_j of intervals I_j = [a_j, b_j] with a_j ∈ Z, b_j ∈ {a_j, a_j + 1}, where d ∈ {0, . . . , n} is the number of non-degenerate intervals among {I_1, . . . , I_n}. If c and d are cells and c ⊆ d, we call c a face of d of codimension dim(d) - dim(c). A face of codimension one is also called a facet.
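As a sanity check of the encoder notation, the shapes flowing through the network can be traced without any deep-learning framework. A minimal sketch (plain Python; it assumes 'same'-padded 3×3 convolutions, so only the 2×2 max-pooling changes the spatial size, and the function name is illustrative):

```python
def encoder_shapes(in_ch, spatial, channels=(16, 32, 64, 128, 256)):
    """Trace (channels, height, width) through the encoder
    In(in_ch) -> C_B(3,16) -> ... -> C_B(3,256): each ConvBlock keeps the
    spatial size ('same' padding assumed) and halves it via max-pooling."""
    h, w = spatial
    shapes = [(in_ch, h, w)]
    for out_ch in channels:
        h, w = h // 2, w // 2      # the trailing ↓2 max-pooling of C_B
        shapes.append((out_ch, h, w))
    return shapes
```

For the 48 × 48 training patches, the bottleneck B(256) thus sees a 1 × 1 spatial grid with 256 channels under these assumptions.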

HOMOLOGY

A chain complex C_* consists of a family {C_d}_{d∈Z} of vector spaces and a family of linear maps {∂_d : C_d → C_{d-1}}_{d∈Z} that satisfy ∂_{d-1} ∘ ∂_d = 0.

Figure 21: (a) and (b) show cells and their boundary (red). (c) and (d) visualize two homologous 1-cycles (blue) in a cubical complex.

For d ∈ Z, we denote by K_d the set of d-dimensional cells in a cubical complex K. The F_2-vector space C_d(K) freely generated by K_d is the chain group of K in degree d. We can think of the elements in C_d(K) as sets of d-dimensional cells and call them chains. These chain groups are connected by linear boundary maps ∂_d : C_d(K) → C_{d-1}(K), which map a cell to the sum of its faces of codimension 1 and are extended linearly to all of C_d(K). The cubical chain complex C_*(K) is given by the pair ({C_d(K)}_{d∈Z}, {∂_d}_{d∈Z}). We denote by Z_d(K) = ker ∂_d the subspace of cycles and by B_d(K) = im ∂_{d+1} the subspace of boundaries in C_d(K). Since ∂_{d-1} ∘ ∂_d = 0, every boundary is a cycle, and the homology group of K in degree d is defined by the quotient space H_d(K) := Z_d(K)/B_d(K). In other words, H_d(K) consists of equivalence classes of d-cycles, and two d-cycles z_1, z_2 are equivalent (homologous) if their difference is a boundary.
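The quotient dim H_d = dim Z_d - dim B_d reduces to rank computations over F_2. A minimal sketch (plain Python, hypothetical helper names) for the hollow unit square with vertices v0..v3 and edges e0 = (v0, v1), e1 = (v1, v2), e2 = (v2, v3), e3 = (v3, v0):

```python
def rank_f2(mat):
    """Rank over F2 via Gaussian elimination; mat is a list of 0/1 rows."""
    mat = [row[:] for row in mat]
    rank, ncols = 0, len(mat[0]) if mat else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(mat)) if mat[r][col]), None)
        if pivot is None:
            continue
        mat[rank], mat[pivot] = mat[pivot], mat[rank]
        for r in range(len(mat)):
            if r != rank and mat[r][col]:
                mat[r] = [(a + b) % 2 for a, b in zip(mat[r], mat[rank])]
        rank += 1
    return rank

# Boundary matrix ∂_1 of the hollow square: rows = vertices, cols = edges.
d1 = [
    [1, 0, 0, 1],   # v0 bounds e0 and e3
    [1, 1, 0, 0],   # v1 bounds e0 and e1
    [0, 1, 1, 0],   # v2 bounds e1 and e2
    [0, 0, 1, 1],   # v3 bounds e2 and e3
]
# Filling in the square adds a single 2-cell with boundary e0+e1+e2+e3.
d2 = [[1], [1], [1], [1]]
# ∂_1 ∘ ∂_2 = 0 over F2 (every boundary is a cycle):
prod = [sum(d1[i][k] * d2[k][0] for k in range(4)) % 2 for i in range(4)]
# Betti numbers of the hollow square (no 2-cells, so B_1 = 0):
beta0 = 4 - rank_f2(d1)                  # dim C_0 - dim B_0
beta1 = (4 - rank_f2(d1)) - 0            # dim Z_1 - dim B_1
```

One connected component and one loop, as expected for a hollow square; adding the 2-cell would raise rank ∂_2 to 1 and kill the loop.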

Main results for TopoMatch and three baselines on six datasets. Green columns indicate the topological metrics. Bold numbers highlight the best performance for a given dataset if it is significant (i.e., the second-best performance is not within std/8). We find that TopoMatch improves the segmentations in all topological metrics for all datasets. We further observe consistently high performance in volumetric metrics. ↑ indicates that a higher value wins and ↓ the opposite.



Bothlevel versus superlevel matching of our method on the Elegans dataset and the CREMI dataset. The bothlevel matching appears to have a more pronounced contribution in the scenario of a topologically complex background.

α ablation on the synMnist dataset and the Roads dataset

Relative Frame ablation of our method on the Roads dataset

dimension-1 and dimensions-0,1 matching ablation for the Hu et al. method on the Roads dataset

Pretraining vs. training from scratch of our method and the Hu et al. method on the Elegans dataset.

The full training routine with the complete training and test sets will be available in our GitHub repository. All our trainings are done on patches of 48 × 48 pixels. For the Buildings dataset (Mnih (2013)), we downsample the images to 375 × 375 pixels and randomly choose 80 samples for training and 20 for testing. For each epoch, we randomly sample 8 patches from each sample. For the Colon dataset (Carpenter et al. (2006); Ljosa et al. (2012)), we downsample the images to 256 × 256 pixels; we randomly choose 20 samples for training and 4 for testing. For each epoch, we randomly sample 12 patches from each sample. For the CREMI dataset (Funke et al. (2019)), we downsample the images to 312 × 312 pixels; we choose 100 samples for training and 25 for testing. For each epoch, we randomly sample 4 patches from each sample. For the Elegans dataset (Ljosa et al. (2012)), we crop the images to 96 × 96 pixels; we randomly choose 80 samples for training and 20 for testing. For each epoch, we randomly sample 1 patch from each sample. For the synMnist dataset (LeCun (1998)), we synthetically modify the MNIST dataset to an image size of 48 × 48 pixels; please see our GitHub repository for details; we train on 4500 full, randomly chosen images and use 1500 for testing. For the Roads dataset (Mnih (2013)), we downsample the images to 375 × 375 pixels; we randomly choose 100 samples for training and 24 for testing. For each epoch, we randomly sample 8 patches from each sample.


We had to choose a different U-Net architecture for the Roads and Buildings datasets because we realized that a larger model is needed to learn useful features for this complex task.

K EVALUATION METRICS

We evaluate our experiments using a set of topological and pixel-based metrics. The metrics are computed with respect to the binarized predictions. Here, TopoMatch constitutes the most meaningful quantification, see Section 3.3. We calculate the TopoMatch metric for dimension-0 (T.M.-0) and dimension-1 (T.M.-1) as well as their sum (T.M.). Furthermore, we implement the Betti number error for dimension-0 (Betti 0), dimension-1 (Betti 1), and their sum (Betti): it computes the Betti numbers of both foregrounds and sums up the absolute differences in each dimension, i.e., it compares the topological complexity of the foregrounds. It is important to consider the dimensions separately since they have different relevance for different datasets. E.g., Roads has many 1-cycles, whereas Buildings has many 0-cycles (connected components). Additionally, we use the traditional Dice metric and Accuracy, which describe the in-total correctly classified pixels, as well as the clDice metric from Shit et al. (2021). Here, we calculate the clDice between the volumes and the skeleta, extracted using the skeletonize function of the skimage Python library. We compute all metrics on the individual test images at their respective sizes (without patching) and take the mean across the whole test set.

For convenience, we define H_*(K) = ⊕_{d∈Z} H_d(K). Note that the homology groups still carry the structure of an F_2-vector space, and their dimensions are the Betti numbers. Homology does not only act on spaces; it also acts on maps between spaces. To this end, a cubical map f : K → K′ induces a linear map C_*(f) : C_*(K) → C_*(K′) by mapping a cell c ∈ K with dim(f(c)) = dim(c) to f(c) and extending this assignment linearly to all of C_*(K). Then C_*(f) descends to a linear map H_*(f) : H_*(K) → H_*(K′).
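The Betti number error for binary 2D masks can be computed without a full persistence pipeline. A minimal sketch (plain Python; it views the 4-connected foreground as a cubical complex and uses union-find for β0 and the Euler characteristic χ = V - E + F = β0 - β1 for β1, which assumes the foreground has no 2-dimensional features):

```python
def betti_2d(mask):
    """Betti numbers (β0, β1) of the 4-connected foreground of a binary
    image, viewed as a cubical complex (pixels as vertices)."""
    pix = {(i, j) for i, row in enumerate(mask)
           for j, v in enumerate(row) if v}
    parent = {p: p for p in pix}
    def find(p):                      # union-find with path halving
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p
    V, E, F = len(pix), 0, 0
    for (i, j) in pix:
        for q in ((i, j + 1), (i + 1, j)):        # right/down neighbors
            if q in pix:
                E += 1
                parent[find(q)] = find((i, j))    # merge components
        if {(i, j + 1), (i + 1, j), (i + 1, j + 1)} <= pix:
            F += 1                                 # filled unit square
    beta0 = len({find(p) for p in pix})
    beta1 = beta0 - (V - E + F)        # β1 from the Euler characteristic
    return beta0, beta1

def betti_error(pred, gt):
    """Betti number error: sum of absolute Betti differences per dimension."""
    b0p, b1p = betti_2d(pred)
    b0g, b1g = betti_2d(gt)
    return abs(b0p - b0g) + abs(b1p - b1g)
```

Because it only compares counts, this metric is blind to where the components and cycles lie, which is precisely the limitation TopoMatch addresses.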

L.4 PERSISTENCE MODULES

A persistence module M consists of a family {M_r}_{r∈R} of vector spaces, which are connected by linear transition maps M_{r,s} : M_r → M_s for all r ≤ s, such that M_{r,r} = id_{M_r} and M_{s,t} ∘ M_{r,s} = M_{r,t} for all r ≤ s ≤ t. M is said to be pointwise finite-dimensional (p.f.d.) if M_r is finite-dimensional for every r ∈ R.

A basic example of a persistence module is an interval module C(I) for a given interval I ⊆ R. It consists of the vector spaces C(I)_r = F_2 for r ∈ I and C(I)_r = 0 otherwise, with transition maps C(I)_{r,s} = id whenever r ≤ s both lie in I and C(I)_{r,s} = 0 otherwise.

A morphism Φ : M → N between persistence modules is a family {Φ_r : M_r → N_r}_{r∈R} of linear maps such that for all r ≤ s the square formed by the transition maps M_{r,s}, N_{r,s} and the maps Φ_r, Φ_s commutes, i.e., N_{r,s} ∘ Φ_r = Φ_s ∘ M_{r,s}. We call Φ an isomorphism (resp. monomorphism, epimorphism) of persistence modules if Φ_r is an isomorphism (resp. monomorphism, epimorphism) of vector spaces for all r ∈ R.

For a family {M_i}_{i∈I} of persistence modules, the direct sum ⊕_{i∈I} M_i is the persistence module consisting of the vector spaces (⊕_{i∈I} M_i)_r = ⊕_{i∈I} (M_i)_r for all r ∈ R, with component-wise transition maps.

A multiset X consists of a set |X| together with a multiplicity function mult_X : |X| → N ∪ {∞}. Equivalently, it can be represented by its underlying set ⨿X = ⨿_{x∈|X|} ⨿_{i=1}^{mult_X(x)} {x}. We say X is finite if its underlying set ⨿X is finite, and its cardinality #X is given by the cardinality of its underlying set.

Let K_* be a filtered cubical complex and L_* a cell-wise refinement according to the compatible ordering c_1, . . . , c_l of the cells in K. The boundary matrix B ∈ F_2^{l×l} of L_* is given entry-wise by B_{ij} = 1 if c_i is a facet of c_j and B_{ij} = 0 otherwise.

L.5 MATCHINGS

A map f : X → Y between multisets is a map f : ⨿X → ⨿Y between their underlying sets. A matching σ : X → Y between multisets is a bijection σ : X′ → Y′ for some multisets X′, Y′ that satisfy ⨿X′ ⊆ ⨿X and ⨿Y′ ⊆ ⨿Y. We call

• coim(σ) = X′ the coimage of σ,
• im(σ) = Y′ the image of σ,
• ker(σ) = X \ X′ the kernel of σ,
• coker(σ) = Y \ Y′ the cokernel of σ.

For a morphism Φ : M → N of persistence modules, the image of Φ is the persistence module im(Φ) with im(Φ)_r = im(Φ_r) and transition maps im(Φ)_{r,s} = N_{r,s}|_{im(Φ_r)} : im(Φ_r) → im(Φ_s) for r ≤ s.

Let M, N be persistence modules. We call M a (persistence) submodule of N if M_r is a subspace of N_r for every r ∈ R and the inclusions i_r : M_r → N_r assemble to a persistence map i = (i_r)_{r∈R}. In this case, we write M ⊆ N.

The composition of two matchings X → Y → Z, given by σ_1 and σ_2, is the composition of the two bijections restricted to Y′, with Y′ = ⨿im(σ_1) ∩ ⨿coim(σ_2).

A persistence module M is said to be staggered if every real number r ∈ R occurs at most once as an endpoint of an interval in B(M).
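The multiset operations above translate almost literally into code. A minimal sketch (plain Python; multisets are lists of hashable elements, and repeated elements are not handled, a simplification over true multisets):

```python
class Matching:
    """A matching between two (simplified) multisets, stored as a partial
    bijection dict mapping coim(σ) to im(σ)."""
    def __init__(self, pairs, src, dst):
        self.bij = dict(pairs)
        self.src, self.dst = src, dst

    def coim(self):
        return list(self.bij.keys())

    def im(self):
        return list(self.bij.values())

    def ker(self):
        return [x for x in self.src if x not in self.bij]

    def coker(self):
        vals = set(self.bij.values())
        return [y for y in self.dst if y not in vals]

    def compose(self, other):
        """Composition X -> Y -> Z: defined exactly where the image of the
        first matching meets the coimage of the second."""
        pairs = [(x, other.bij[y]) for x, y in self.bij.items()
                 if y in other.bij]
        return Matching(pairs, self.src, other.dst)
```

This is the mechanism behind the TopoMatch matching: composing σ(G, C) with the reverse of σ(L, C) matches exactly those intervals of B(G) and B(L) whose images in B(C) agree.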

