PROVABLY AUDITING ORDINARY LEAST SQUARES IN LOW DIMENSIONS

Abstract

Auditing the stability of a machine learning model to small changes in the training procedure is critical for engendering trust in practical applications. For example, a model should not be overly sensitive to removing a small fraction of its training data. However, algorithmically validating this property seems computationally challenging, even for the simplest of models: Ordinary Least Squares (OLS) linear regression. Concretely, recent work defines the stability of a regression as the minimum number of samples that need to be removed so that rerunning the analysis overturns the conclusion (Broderick et al., 2020), specifically meaning that the sign of a particular coefficient of the OLS regressor changes. But the only known approach for estimating this metric, besides the obvious exponential-time algorithm, is a greedy heuristic that may produce severe overestimates and therefore cannot certify stability. We show that stability can be efficiently certified in the low-dimensional regime: when the number of covariates is a constant but the number of samples is large, there are polynomial-time algorithms for estimating (a fractional version of) stability, with provable approximation guarantees. Applying our algorithms to the Boston Housing dataset, we exhibit regression analyses where our estimator outperforms the greedy heuristic, and can successfully certify stability even in the regime where a constant fraction of the samples are dropped.

1. INTRODUCTION

A key facet of interpretability of machine learning models is understanding how different subsets of the training data influence the learned model and its predictions. Computing the influences of individual training points has been shown to be a useful tool for enhancing trust in the model (Zhou et al., 2019), for tracing the origins of model bias (Brunet et al., 2019), and for identifying mislabelled training data and other model debugging (Koh & Liang, 2017). Modelling the influence of groups of training points has applications to measuring fairness (Chen et al., 2018), vulnerability to contamination of multi-source training data (Hayes & Ohrimenko, 2018), and (most relevant to this paper) identification of unstable predictions (Ilyas et al., 2022) and models (Broderick et al., 2020). In a high-stakes machine learning application, it would likely be alarming if some data points were so influential that the removal of, say, 1% of the training data dramatically changed the model. An ideal, trustworthy machine learning pipeline should therefore include validation that this does not happen. But the obvious algorithm for checking whether a model trained on n data points exhibits this instability would require computing the group influences of all (n choose n/100) subsets of the data, which is computationally infeasible even for fairly small n. Instead, current methods for estimating the stability of a model simply use the first-order approximation of group influence: namely, the sum of the individual influences of the data points in the group. With this approximation, the vulnerability of a model to dropping αn data points is heuristically estimated by dropping the αn most individually influential data points (Broderick et al., 2020; Ilyas et al., 2022). This heuristic can be thought of as using "local" stability as a proxy for "global" stability, and it has found substantial anecdotal success in diagnosing unstable models.
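For concreteness, the greedy drop-the-most-influential-points heuristic described above can be sketched as follows. This is an illustrative sketch, not the implementation used in the cited works: for simplicity it refits the regression exactly after each candidate removal rather than using an influence-function approximation, and the function name is our own.

```python
import numpy as np

def greedy_sign_flip_upper_bound(X, y, max_drop):
    """Greedy heuristic in the spirit of Broderick et al. (2020): repeatedly
    drop the single sample whose removal pushes the first OLS coefficient
    furthest toward (or past) a sign flip. Returns the number of points
    dropped when the sign first flips, or None if it never flips within
    max_drop removals. Note: this yields an UPPER bound on (integral)
    stability; it cannot certify stability."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    keep = np.ones(len(y), dtype=bool)
    base_sign = np.sign(np.linalg.lstsq(X, y, rcond=None)[0][0])
    for t in range(1, max_drop + 1):
        best_i, best_coef = None, None
        for i in np.flatnonzero(keep):
            # exact leave-one-out refit of the first OLS coefficient
            keep[i] = False
            coef = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0][0]
            keep[i] = True
            # prefer the removal that moves the coefficient most toward
            # the opposite sign
            if best_coef is None or base_sign * coef < base_sign * best_coef:
                best_i, best_coef = i, coef
        keep[best_i] = False
        if np.sign(best_coef) != base_sign:
            return t
    return None
```

For example, on a tiny one-covariate dataset with a single extreme outlier, the heuristic finds that removing one point flips the coefficient's sign, while on a perfectly consistent dataset it reports no flip within the budget.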
Unfortunately, for correlated groups of data points, the first-order approximation of the group influence is often an underestimate (Koh et al., 2019), so large local stability does not actually certify that a model is provably stable to removing small subsets of the data. In fact, stability certification is a challenging and open problem even in the simplest of models: linear regression via Ordinary Least Squares (OLS). Concretely, given a regression dataset, a natural metric for the stability of the OLS regressor is the minimum number of data points that must be removed from the dataset to flip the sign of a particular coefficient of the regressor (e.g., in causal inference, the coefficient measuring the treatment effect). Recent work has used the local stability heuristic to diagnose unstable OLS regressions in several prominent economics studies (Broderick et al., 2020), identifying examples where even a statistically significant conclusion can be overturned by removing less than 1% of the data points. However, the converse question of validating stable conclusions remains unaddressed: given a regression dataset, can we efficiently certify non-trivial lower bounds on the stability of the OLS regressor? Our work takes steps towards addressing this question, via the following contributions:

• We introduce a natural fractional relaxation of the above notion of OLS stability, in which we allow removing fractions of data points and seek to minimize the total removed weight. We call this finite-sample stability, and henceforth refer to the prior notion as "integral" stability.

• We develop approximation algorithms for finite-sample stability, with (a) provable guarantees under reasonable anti-concentration assumptions on the dataset, and (b) running time polynomial in the size of the dataset, so long as the dimension of the data is a constant (in contrast, the naive algorithm is exponential in the size of the dataset). Moreover, we prove that (at least for exact algorithms) exponential dependence of the running time on the dimension is unavoidable under standard complexity assumptions.

• We use modifications of our algorithms to compute assumption-free upper and lower bounds on the finite-sample stability of several simple synthetic and real datasets, achieving tighter upper bounds than prior work and the first non-trivial lower bounds, i.e., certifications that the OLS regressor is stable.

Why define stability this way? The definition of integral stability was introduced in (Broderick et al., 2020), along with several variants (e.g., the smallest perturbation that causes the first coordinate to lose significance). We choose the definition based on the sign of the first coordinate because it has a clear practical interpretation (does the first covariate positively or negatively affect the response?), one that does not depend on the choice of additional parameters such as a significance level. We study the fractional relaxation so that stability is defined by a continuous optimization problem. Note that certifying a lower bound on fractional stability immediately certifies a lower bound on integral stability; we will see later (Remark 3.1) that a near-converse holds in low dimensions.

Why is low-dimensional regression important? Given that much of machine learning happens in high-dimensional settings, where the number of covariates can even exceed the number of data points, it is natural to wonder why low-dimensional settings are still important. First, in application areas such as econometrics, linear regressions with as few as two to four covariates are very common (Britto et al., 2022; Bianchi & Bigio, 2022; Hopenhayn et al., 2022), often serving as proofs-of-concept for more complex models. Second, even in settings where the number of covariates is larger, it is often expected that only a few covariates are relevant. In such applications, the analysis often consists of a variable selection step followed by regression on a much-reduced set of covariates (Cai & Wang, 2011). In all of these settings, understanding the stability of an estimator is important, and our work gives some of the first provable guarantees that avoid strong distributional assumptions. Moreover, our lower bounds show that certifying the stability of truly high-dimensional models, even linear ones, is intractable.
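To make the fractional relaxation concrete: each 0/1 decision to remove a point is replaced by a weight w_i in [0, 1], the removed mass is the sum of (1 - w_i), and the regression becomes a weighted least squares fit. The sketch below (our own illustrative names, not code from this paper) verifies whether a given weight vector witnesses an upper bound on finite-sample stability, i.e., whether it flips the sign of the first coefficient.

```python
import numpy as np

def weighted_ols_first_coef(X, y, w):
    """First coordinate of the weighted OLS regressor
    argmin_b sum_i w_i * (y_i - <X_i, b>)^2,
    computed by rescaling rows by sqrt(w_i)."""
    X, y, w = np.asarray(X, float), np.asarray(y, float), np.asarray(w, float)
    sw = np.sqrt(w)
    return np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0][0]

def check_fractional_removal(X, y, w):
    """Check a fractional-removal witness: returns (removed_mass, flipped),
    where removed_mass = sum_i (1 - w_i) and flipped indicates whether the
    sign of the first OLS coefficient changes relative to the full dataset
    (all weights equal to 1)."""
    w = np.asarray(w, dtype=float)
    full = weighted_ols_first_coef(X, y, np.ones(len(w)))
    reweighted = weighted_ols_first_coef(X, y, w)
    return float(np.sum(1.0 - w)), bool(np.sign(reweighted) != np.sign(full))
```

On the outlier example from before, down-weighting the outlier to w = 0.2 already flips the sign, so the fractional stability is at most 0.8, strictly cheaper than the one full removal needed integrally. This illustrates why fractional witnesses can be tighter, and why a lower bound on fractional stability implies one on integral stability.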

1.1. FORMAL PROBLEM STATEMENT

We are given a deterministic and arbitrary set of n samples (X_i, y_i), i = 1, ..., n, where each X_i is a vector of d real-valued covariates and each y_i is a real-valued response. We are interested in a single coefficient of the OLS regressor (without loss of generality, the first coordinate): in an application, the first covariate may be the treatment and the rest may be controls. The sign of this coefficient

