FAIR ATTRIBUTE COMPLETION ON GRAPH WITH MISSING ATTRIBUTES

Abstract

Tackling unfairness in graph learning models is a challenging task, as the unfairness issues on graphs involve both attributes and topological structures. Existing work on fair graph learning simply assumes that attributes of all nodes are available for model training and then makes fair predictions. In practice, however, the attributes of some nodes might not be accessible due to missing data or privacy concerns, which makes fair graph learning even more challenging. In this paper, we propose FairAC, a fair attribute completion method, to complete missing information and learn fair node embeddings for graphs with missing attributes. FairAC adopts an attention mechanism to deal with the attribute missing problem and, meanwhile, it mitigates two types of unfairness, i.e., feature unfairness from attributes and topological unfairness due to attribute completion. FairAC can work on various types of homogeneous graphs and generate fair embeddings for them, and thus can be applied to most downstream tasks to improve their fairness performance. To the best of our knowledge, FairAC is the first method that jointly addresses the graph attribute completion and graph unfairness problems. Experimental results on benchmark datasets show that our method achieves better fairness performance with less sacrifice in accuracy, compared with state-of-the-art methods for fair graph learning.

1. INTRODUCTION

Graphs, such as social networks, biomedical networks, and traffic networks, are commonly observed in many real-world applications. A lot of graph-based machine learning methods have been proposed in the past decades, and they have shown promising performance in tasks like node similarity measurement, node classification, graph regression, and community detection. In recent years, graph neural networks (GNNs) have been actively studied (Scarselli et al., 2008; Wu et al., 2020; Jiang et al., 2019; 2020; Zhu et al., 2021c; b; a; Hua et al., 2020; Chu et al., 2021), which can model graphs with high-dimensional attributes in the non-Euclidean space and have achieved great success in many areas such as recommender systems (Sheu et al., 2021).

However, it has been observed that many graphs are biased, and thus GNNs trained on biased graphs may be unfair with respect to certain sensitive attributes such as demographic groups. For example, in a social network where users of the same gender have more active connections, the GNNs tend to pay more attention to such gender information and lead to gender bias, recommending more friends of the same gender identity to a user while ignoring other attributes like interests. From the data privacy perspective, it is also possible to infer one's sensitive information from the results given by GNNs (Sun et al., 2018). At a time when GNNs are widely deployed in the real world, such severe unfairness is unacceptable. Thus, fairness in graph learning has recently emerged as a notable research problem.

Existing work on fair graph learning mainly focuses on the pre-processing, in-processing, and post-processing steps of the graph learning pipeline in order to mitigate unfairness issues. The pre-processing approaches modify the original data to conceal sensitive attributes.
Fairwalk (Rahman et al., 2019) is a representative pre-processing method, which gives each group of neighboring nodes an equal chance of being chosen in the sampling process. Among in-processing methods, the most popular way is to add a sensitive discriminator as a constraint, in order to filter out sensitive information from the original data. For example, FairGNN (Dai & Wang, 2021) adopts a sensitive discriminator as such a constraint during training.
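The Fairwalk-style sampling described above can be illustrated with a minimal sketch. This is not the authors' implementation; the function name, toy graph, and group labels are hypothetical, and the sketch assumes each node carries a single discrete sensitive attribute. The key step is a two-stage draw: first pick a sensitive group uniformly among the groups present in the neighborhood, then pick a neighbor uniformly within that group, so each group is visited with equal probability regardless of its size.

```python
import random

def fairwalk_step(node, adj, group, rng=random):
    """One Fairwalk-style transition from `node`.

    Instead of sampling a neighbor uniformly (which over-represents
    large groups), first sample a sensitive group uniformly among the
    groups present in the neighborhood, then sample a neighbor
    uniformly within that group.
    """
    by_group = {}
    for n in adj[node]:
        by_group.setdefault(group[n], []).append(n)
    chosen_group = rng.choice(sorted(by_group))  # uniform over groups
    return rng.choice(by_group[chosen_group])    # uniform within group

# Toy neighborhood: node 0 has 3 neighbors in group "a", 1 in group "b".
adj = {0: [1, 2, 3, 4]}
group = {1: "a", 2: "a", 3: "a", 4: "b"}

# Plain uniform sampling would reach group "b" 25% of the time;
# the two-stage draw reaches it about 50% of the time.
random.seed(0)
hits_b = sum(group[fairwalk_step(0, adj, group)] == "b" for _ in range(10000))
print(hits_b / 10000)
```

Running full fair random walks would simply repeat this step to build walk sequences for a skip-gram style embedding model, as in the original node2vec-like pipeline.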

Code availability: https://github.com/donglgcn/FairAC

