RANDOM COORDINATE LANGEVIN MONTE CARLO

Abstract

Langevin Monte Carlo (LMC) is a popular Markov chain Monte Carlo sampling method. One drawback is that it requires the computation of the full gradient at each iteration, an expensive operation if the dimension of the problem is high. We propose a new sampling method: Random Coordinate LMC (RC-LMC). At each iteration, a single coordinate is randomly selected to be updated by a multiple of the partial derivative along this direction plus noise, while all other coordinates remain untouched. We investigate the total complexity of RC-LMC and compare it with the classical LMC for log-concave probability distributions. When the gradient of the log-density is Lipschitz, RC-LMC is less expensive than the classical LMC if the log-density is highly skewed for high-dimensional problems, and when both the gradient and the Hessian of the log-density are Lipschitz, RC-LMC is always cheaper than the classical LMC, by a factor proportional to the square root of the problem dimension. In the latter case, our estimate of complexity is sharp with respect to the dimension.

1. INTRODUCTION

Monte Carlo sampling plays an important role in machine learning (Andrieu et al., 2003) and Bayesian statistics. In applications, the need for sampling is found in atmospheric science (Fabian, 1981), epidemiology (Li et al., 2020), and petroleum engineering (Nagarajan et al., 2007), in the form of data assimilation (Reich, 2011), volume computation (Vempala, 2010), and bandit optimization (Russo et al., 2018). In many of these applications, the dimension of the problem is extremely high. For example, in weather prediction, one measures the current temperature and moisture levels to infer the state of the air flow, before running the Navier-Stokes equations into the near future (Evensen, 2009). In a global numerical weather prediction model, the degrees of freedom in the air flow can be as high as $10^9$. Another example is from epidemiology: when a disease is spreading, one measures the daily new infection cases to infer the transmission rate in different regions. In county-level modeling, the 3,141 counties in the US are treated separately, so the parameter to be inferred has a dimension of at least 3,141 (Li et al., 2020). In this work, we focus on Monte Carlo sampling of log-concave probability distributions on $\mathbb{R}^d$, meaning the probability density can be written as $p(x) \propto e^{-f(x)}$ where $f(x)$ is a convex function. The goal is to generate (approximately) i.i.d. samples according to the target probability distribution with density $p(x)$.
Several sampling frameworks have been proposed in the literature, including importance sampling and sequential Monte Carlo (Geweke, 1989; Neal, 2001; Del Moral et al., 2006); ensemble methods (Reich, 2011; Iglesias et al., 2013); Markov chain Monte Carlo (MCMC) (Roberts and Rosenthal, 2004), including Metropolis-Hastings based MCMC (MH-MCMC) (Metropolis et al., 1953; Hastings, 1970; Roberts and Tweedie, 1996); Gibbs samplers (Geman and Geman, 1984; Casella and George, 1992); and Hamiltonian Monte Carlo (Neal, 1993; Duane et al., 1987). Langevin Monte Carlo (LMC) (Rossky et al., 1978; Parisi, 1981; Roberts and Tweedie, 1996) is a popular MCMC method that has received intense attention in recent years due to progress in the non-asymptotic analysis of its convergence properties (Durmus and Moulines, 2017; Dalalyan, 2017; Dalalyan and Karagulyan, 2019; Durmus et al., 2019). Denoting by $x_m$ the location of the sample at the $m$-th iteration, LMC obtains the next location as follows:
$$x_{m+1} = x_m - h\nabla f(x_m) + \sqrt{2h}\,\xi_m^d,$$
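To make the contrast concrete, the following is a minimal sketch of one iteration of classical LMC alongside the random-coordinate variant described in the abstract: a single uniformly chosen coordinate is updated by a multiple of the partial derivative in that direction plus one-dimensional noise. The function names, the test target (a standard Gaussian, $f(x) = \|x\|^2/2$), and the choice of using the same step size $h$ for both methods are illustrative assumptions, not the paper's exact scheme; the paper's step-size scaling for RC-LMC may differ.

```python
import numpy as np

def lmc_step(x, grad_f, h, rng):
    """One classical LMC iteration: full gradient plus d-dimensional noise."""
    return x - h * grad_f(x) + np.sqrt(2 * h) * rng.standard_normal(x.shape)

def rc_lmc_step(x, partial_f, h, rng):
    """One RC-LMC iteration (sketch): update a single random coordinate.

    Only the partial derivative along the chosen direction is evaluated;
    all other coordinates remain untouched.
    """
    d = x.shape[0]
    i = rng.integers(d)                       # coordinate chosen uniformly at random
    x = x.copy()
    x[i] += -h * partial_f(x, i) + np.sqrt(2 * h) * rng.standard_normal()
    return x

# Illustrative target: standard Gaussian, f(x) = ||x||^2 / 2, so grad f(x) = x.
grad_f = lambda x: x
partial_f = lambda x, i: x[i]

rng = np.random.default_rng(0)
x = np.zeros(2)
for _ in range(50_000):
    x = rc_lmc_step(x, partial_f, h=0.05, rng=rng)
```

Each RC-LMC iteration costs one partial-derivative evaluation instead of a full gradient, which is the source of the per-iteration savings analyzed in the paper.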

