Solving stochastic weak Minty variational inequalities without increasing batch size

Abstract

This paper introduces a family of stochastic extragradient-type algorithms for a class of nonconvex-nonconcave problems characterized by the weak Minty variational inequality (MVI). Unlike existing results on extragradient methods in the monotone setting, employing diminishing stepsizes is no longer possible in the weak MVI setting. This has led to approaches such as increasing the batch size per iteration, which can however be prohibitively expensive. In contrast, our proposed method involves two stepsizes and requires only one additional oracle evaluation per iteration. We show that it is possible to keep one stepsize fixed while only the second stepsize is taken to be diminishing, which makes the method interesting even in the monotone setting. Almost sure convergence is established, and we provide a unified analysis for this family of schemes, which contains a nonlinear generalization of the celebrated primal-dual hybrid gradient algorithm.

1. Introduction

Stochastic first-order methods have been at the core of the current success in deep learning applications. These methods are by now mostly well understood for minimization problems. This is even the case in the nonconvex setting, where there exist matching upper and lower bounds on the complexity of finding an approximately stable point (Arjevani et al., 2019). The picture becomes less clear when moving beyond minimization into nonconvex-nonconcave minimax problems, or more generally nonmonotone variational inequalities. Even in the deterministic case, finding a stationary point is in general intractable (Daskalakis et al., 2021; Hirsch & Vavasis, 1987). This is in stark contrast with minimization, where only global optimality is NP-hard. An interesting nonmonotone class for which efficient algorithms do exist is characterized by the so-called weak Minty variational inequality (MVI) (Diakonikolas et al., 2021). This problem class captures nontrivial structures such as attracting limit cycles and is governed by a parameter ρ whose negativity increases the degree of nonmonotonicity. It turns out that the stepsize γ of the exploration step in extragradient-type schemes lower bounds the problem class through ρ > -γ/2 (Pethick et al., 2022). In other words, γ seemingly needs to be taken large to guarantee convergence for a large class. This reliance on a large stepsize is at the core of why the community has struggled to provide stochastic variants for weak MVIs. The only known results effectively increase the batch size at every iteration (Diakonikolas et al., 2021, Thm. 4.5), a strategy that would be prohibitively expensive in most machine learning applications. Pethick et al. (2022) proposed SEG+, which attempts to tackle the noise by only diminishing the second stepsize. This suffices in the special case of unconstrained quadratic games but can fail even in the monotone case, as illustrated in Figure 1.
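For concreteness, the weak MVI condition of Diakonikolas et al. (2021) can be sketched as follows (the notation here is ours, and the paper's precise assumption may differ in details): there exists a point z* such that

```latex
\langle F(z),\, z - z^\star \rangle \;\ge\; \rho \,\| F(z) \|^2 \qquad \text{for all } z ,
```

where F is the operator defining the variational inequality and ρ may be negative; ρ = 0 recovers the classical (star) Minty condition, while more negative ρ admits more strongly nonmonotone problems.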
This naturally raises the following research question: Can stochastic weak Minty variational inequalities be solved without increasing the batch size? We resolve this open problem in the affirmative, when the stochastic oracles are Lipschitz in mean, with a modification of stochastic extragradient called bias-corrected stochastic extragradient (BC-SEG+). The scheme requires only one additional first-order oracle call, while crucially maintaining the fixed stepsize. Specifically, we make the following contributions:
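As a rough illustration of the two-stepsize template discussed above, the following sketch runs an extragradient-style iteration with a fixed exploration stepsize γ and a diminishing update stepsize. This is a simplified deterministic stand-in, not the paper's BC-SEG+: it omits the bias-correction term and the stochastic oracles, and the function names and stepsize schedule are our own choices for illustration.

```python
import numpy as np


def two_step_extragradient(F, z0, gamma=0.5, alpha0=0.5, iters=2000):
    """Two-stepsize extragradient sketch (illustrative, not BC-SEG+).

    The exploration step uses a *fixed* stepsize ``gamma``, while the
    update step uses a diminishing stepsize ``alpha_k``, mirroring the
    idea of keeping one stepsize fixed and shrinking only the second.
    """
    z = np.asarray(z0, dtype=float)
    for k in range(iters):
        z_bar = z - gamma * F(z)           # exploration step, fixed gamma
        alpha = alpha0 / np.sqrt(k + 1)    # diminishing second stepsize
        z = z - alpha * F(z_bar)           # update step
    return z
```

On the rotation operator F(x, y) = (y, -x), i.e. the saddle operator of min_x max_y xy, the fixed-γ exploration step stabilizes the iterates while the diminishing update stepsize drives them toward the solution at the origin.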

