LOGICAL VIEW ON FAIRNESS OF A BINARY CLASSIFICATION TASK

Abstract

Ethical, Interpretable/Explainable, and Responsible AI are active areas of research and important social initiatives, and vendors offer solutions: Microsoft, for instance, has compiled a Responsible AI platform. Within this context, the challenges of algorithmic fairness and of the trustworthiness of machine learning are paramount. Furthermore, several authors argue that the emergence of algorithmically infused societies necessitates innovative approaches to measuring feasible information, e.g., that data collection should follow a trustworthy social theory. In this paper, we show that this approach is heuristic at best. We prove that, regardless of the data, fairness and trustworthiness are algorithmically undecidable for a basic machine learning task, binary classification. Therefore, even an approach that not only improves on but fully solves the three usually assumed issues (the insufficient quality of measurements, the complex consequences of (mis)measurements, and the limits of existing social theories) remains a heuristic. We show that, effectively, the fairness of a classifier is not even a trade-off (a version of bias-variance, say) but a logical phenomenon. Namely, we exhibit a language L and an L-theory T for the binary classification task such that the very notion of loss is not expressible by a first-order L-formula.

1. INTRODUCTION

Ethical, Interpretable/Explainable, and Responsible AI are active areas of research and important social initiatives, and vendors offer solutions: Microsoft, for instance, has compiled a Responsible AI platform. Within this context, the challenges of algorithmic fairness and of the trustworthiness of machine learning are paramount. Furthermore, several authors argue that the emergence of algorithmically infused societies necessitates innovative approaches to measuring feasible information, e.g., that data collection should follow a trustworthy social theory [3]. Difficulties associated with such an approach are discussed in [7]. Moreover, in this paper we show that this approach is heuristic at best. We prove that, regardless of the data, fairness and trustworthiness are algorithmically undecidable for the binary classification task (cf. [4], [5]). Therefore, even an approach that not only improves on but fully solves the three usually assumed issues (the insufficient quality of measurements, the complex consequences of (mis)measurements, and the limits of existing social theories) remains a heuristic. We prove that, effectively, the fairness of a binary classifier is not even a trade-off (e.g., a version of bias-variance, complexity, etc.) but a logical phenomenon. Namely, we exhibit a language L and an L-theory T for the binary classification task such that the very notion of loss is not expressible by a first-order L-formula. Note that the essence of the "mass view" approach is that, unlike in a traditional machine learning context, we make no assumptions about the nature of a classifier's loss other than that it should provide a way to compare two (potentially different) classifiers. Under this very broad perspective, it turns out that, in a natural model, the loss of a classifier is inexpressible as a first-order formula (cf. the Appendix for the definitions).
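The comparison-only assumption described above can be made concrete with a minimal sketch. The 0-1 loss, the toy data, and the two classifiers below are our own illustrative choices, not part of the paper's formal construction; the only feature the "mass view" relies on is that *some* loss induces an ordering on classifiers:

```python
from typing import Callable, List, Tuple

# A binary classifier maps a feature vector to a label in {0, 1}.
Classifier = Callable[[Tuple[float, ...]], int]
Dataset = List[Tuple[Tuple[float, ...], int]]

def zero_one_loss(clf: Classifier, data: Dataset) -> float:
    """Fraction of misclassified examples: one concrete choice of loss."""
    return sum(clf(x) != y for x, y in data) / len(data)

def prefer(clf_a: Classifier, clf_b: Classifier, data: Dataset,
           loss=zero_one_loss) -> str:
    """The only assumption the 'mass view' makes: a loss lets us
    compare two (potentially different) classifiers."""
    la, lb = loss(clf_a, data), loss(clf_b, data)
    return "A" if la < lb else ("B" if lb < la else "tie")

# Toy data: the label is 1 iff the single feature is positive.
data: Dataset = [((x,), int(x > 0)) for x in (-2.0, -1.0, 0.5, 1.5)]

def always_one(x):   # constant classifier
    return 1

def threshold(x):    # classifier matching the labeling rule
    return int(x[0] > 0)

print(prefer(always_one, threshold, data))  # prints "B"
```

The theorem asserts that, however such a loss is chosen, the comparison it induces cannot be captured by a first-order L-formula in the natural model; the code above merely fixes what "provide a way to compare" means operationally.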
It follows that any feasible definition of fairness for a machine learning classification task is undecidable. Indeed, one has to assume in the first place that two classifiers can be compared by their performance characteristics. If the latter comparison is not expressible, then one cannot reach a sensible conclusion about fairness. By the same token, since all derived heuristics, such as transparency, interpretability, and trust, must include a notion of fairness,

