GRAPH CONVOLUTIONAL NORMALIZING FLOWS FOR SEMI-SUPERVISED CLASSIFICATION & CLUSTERING

Anonymous

Abstract

Graph neural networks (GNNs) are discriminative models that directly model the class posterior p(y|x) for semi-supervised classification of graph data. While effective for prediction, as a representation learning approach, the node representations extracted from a GNN often miss information useful for clustering, because such information is not needed for good classification. In this work, we replace the GNN layer with a combination of graph convolutions and normalizing flows under a Gaussian mixture representation space, which allows us to build a generative model that models both the class conditional likelihood p(x|y) and the class prior p(y). The resulting neural network, GC-Flow, enjoys two benefits: it retains the predictive power of graph convolutions, and it produces well-separated clusters in the representation space, owing to the structuring of the representation as a mixture of Gaussians. We demonstrate these benefits on a variety of benchmark data sets. Moreover, we show that additional parameterization, such as of the adjacency matrix used for graph convolutions, yields further improvement in clustering.

1. INTRODUCTION

Semi-supervised learning (Zhu, 2008) refers to learning a classification model from a typically small amount of labeled data together with a possibly large amount of unlabeled data. The unlabeled data, combined with additional assumptions (such as the manifold and smoothness assumptions), may significantly improve the accuracy of a classifier even when few labeled examples are available. A typical recent example of such a model is the graph convolutional network (GCN) of Kipf & Welling (2017), which capitalizes on the graph structure underlying the data (viewed as an extension of a discretized manifold) to achieve effective classification. GCN, together with other pioneering parameterized models, has spawned a flourishing literature on graph neural networks (GNNs), which excel at node classification (Zhou et al., 2020; Wu et al., 2021).

However, because they are driven by the classification task, GCN and other GNNs may not produce node representations that carry useful information for goals other than classification. For example, the representations do not cluster well in some cases. This phenomenon is no surprise: when one treats the penultimate activations as the data representations and the last dense layer as a linear classifier, the representations need only be close to linearly separable for accurate classification; they do not necessarily form well-separated clusters.

This observation leads to a natural question: can one build a representation model for graphs that is not only effective for classification but also unravels the inherent structure of the data for clustering? The answer is affirmative. One idea is, rather than constructing a discriminative model p(y|x) as all GNNs do, to build a generative model p(x|y)p(y) whose class conditional likelihood is defined by explicitly modeling the representation space, for example as a mixture of well-separated unimodal distributions.
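To make the generative-classification idea concrete, the following is a minimal sketch (not the paper's actual model) of Bayes-rule classification in a representation space structured as a mixture of isotropic Gaussians: given a representation z (e.g., the output of a flow applied to x), the class posterior is p(y|z) ∝ p(z|y) p(y). The class means, prior, and variance below are illustrative placeholders.

```python
import numpy as np

def gaussian_log_pdf(z, mu, var):
    """Log density of an isotropic Gaussian N(mu, var * I) at each row of z."""
    d = z.shape[1]
    diff = z - mu
    return -0.5 * (d * np.log(2 * np.pi * var) + np.sum(diff ** 2, axis=1) / var)

def classify(z, means, prior, var=1.0):
    """Bayes-rule posterior p(y|z) ∝ p(z|y) p(y) under a Gaussian mixture.

    z: (n, d) representations; means: (k, d) class means; prior: (k,) class
    prior p(y). Returns an (n, k) array of posterior probabilities.
    """
    # Log joint log p(z|y) + log p(y) for every class y.
    log_joint = np.stack([gaussian_log_pdf(z, mu, var) for mu in means], axis=1)
    log_joint += np.log(prior)
    # Normalize stably by subtracting the per-row maximum before exponentiating.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)

# Two well-separated (hypothetical) class means: a point near the first mean
# receives nearly all posterior mass for class 0.
means = np.array([[3.0, 0.0],
                  [-3.0, 0.0]])
prior = np.array([0.5, 0.5])
z = np.array([[2.5, 0.2]])
post = classify(z, means, prior)
```

Because the class-conditional densities are well-separated unimodal Gaussians, representations cluster by construction, in contrast to a discriminative model, which only needs near-linear separability.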
Indeed, the recently proposed FlowGMM model (Izmailov et al., 2020) uses a normalizing flow to map the distribution of input features to a Gaussian mixture, resulting in well-structured clusters. This model, however, is not designed for graphs, and it underperforms GNNs that leverage the graph structure for classification. In this work, we present graph convolutional normalizing flows (GC-Flows), a generative model that not only classifies well but also yields node representations that capture the inherent structure of the data, thereby forming high-quality clusters. We can relate GC-Flows to both GCNs and

