ARMCMC: ONLINE MODEL PARAMETERS FULL PROBABILITY ESTIMATION IN BAYESIAN PARADIGM

Abstract

Although the Bayesian paradigm provides a rigorous framework to estimate the full probability distribution over unknown parameters, its online implementation can be challenging due to heavy computational costs. This paper proposes Adaptive Recursive Markov Chain Monte Carlo (ARMCMC), which estimates the full probability density of model parameters while alleviating the shortcomings of conventional online approaches. These shortcomings include: being able to account only for Gaussian noise, being applicable only to systems with the linear-in-the-parameters (LIP) constraint, or requiring persistent excitation (PE). In ARMCMC, we propose a variable jump distribution, which depends on a temporal forgetting factor. This allows one to adjust the trade-off between exploitation and exploration, depending on whether there is an abrupt change in the parameters being estimated. We prove that ARMCMC requires fewer samples to achieve the same precision and reliability as conventional MCMC approaches. We demonstrate our approach on two challenging benchmarks: parameter estimation in a soft bending actuator and the Hunt-Crossley dynamic model. Our method shows at least 70% improvement in parameter point-estimation accuracy and approximately 55% reduction in tracking error of the value of interest compared to recursive least squares and conventional MCMC.

1. INTRODUCTION

Bayesian methods are powerful tools not only to obtain a numerical estimate of a parameter but also to give a measure of confidence (Kuśmierczyk et al., 2019; Bishop, 2006; Joho et al., 2013). In particular, Bayesian inference calculates the probability distribution of parameters rather than a point estimate, which is prevalent in frequentist paradigms (Tobar, 2018). One of the main advantages of probabilistic frameworks is that they enable decision making under uncertainty (Noormohammadi-Asl & Taghirad, 2019). In addition, knowledge fusion is significantly facilitated in probabilistic frameworks; different sources of data or observations can be combined according to their level of certainty in a principled manner (Agand & Shoorehdeli, 2019). Nonetheless, Bayesian inference requires high computational effort to obtain the whole probability distribution, as well as general prior knowledge of the noise distribution before estimation. One of the most effective tools for Bayesian inference is the family of Markov Chain Monte Carlo (MCMC) methods. In the field of system identification, MCMC variants such as the one recently proposed by Green (2015) are mostly focused on offline system identification. This is partly due to computational challenges which prevent real-time use (Kuindersma et al., 2012). The standard MCMC algorithm is not suitable for model variation, since different candidates do not share the same parameter set. Green (1995) first introduced reversible jump Markov chain Monte Carlo (RJMCMC) as a method to address the model selection problem. In this method, an extra pseudo-random variable is defined to handle dimension mismatch. There are further extensions of MCMC in the literature; however, an online implementation has yet to be reported. Motion filtering and force prediction of robotic manipulators are important fields of study with interesting challenges suitable for Bayesian inference to address (Saar et al., 2018).
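To make the MCMC machinery referenced above concrete, the following is a minimal random-walk Metropolis-Hastings sampler (the standard algorithm, not the RJMCMC variant); all names and the toy target are illustrative, not taken from the paper:

```python
import numpy as np

def metropolis_hastings(log_target, x0, n_samples, step=0.5, rng=None):
    """Random-walk Metropolis-Hastings: draws samples from a density
    known only up to a normalizing constant via log_target."""
    rng = rng if rng is not None else np.random.default_rng(0)
    x = x0
    lp = log_target(x)
    samples = np.empty(n_samples)
    for i in range(n_samples):
        prop = x + step * rng.standard_normal()   # symmetric proposal
        lp_prop = log_target(prop)
        # Accept with probability min(1, target(prop)/target(x)).
        if np.log(rng.uniform()) < lp_prop - lp:
            x, lp = prop, lp_prop
        samples[i] = x
    return samples

# Toy target: standard normal, log density up to a constant.
samples = metropolis_hastings(lambda x: -0.5 * x**2, x0=3.0, n_samples=20000)
```

After discarding a burn-in, the chain's empirical mean and standard deviation approach those of the target, which is the property the offline identification methods discussed above rely on.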
Here, measurements are inherently noisy, which is not desirable for control purposes. Likewise, inaccuracy, inaccessibility, and cost are typical challenges that make force measurement impractical (Agand et al., 2016). Different environmental identification methods have been proposed in the literature for linear models with Gaussian noise (Wang et al., 2018); however, for nonlinear models such as Hunt-Crossley with non-Gaussian noise (e.g., impulsive disturbances), there is no optimal solution to the identification problem. Diolaiti et al. (2005) proposed a double-stage bootstrapped method for online identification of the Hunt-Crossley model, which is sensitive to parameter initial conditions. Carvalho & Martins (2019) proposed a method to determine the damping term in the Hunt-Crossley model. A neural-network-based approach was introduced to control the contact/non-contact Hunt-Crossley model in (Bhasin et al., 2008).

This paper proposes a new technique, Adaptive Recursive Markov Chain Monte Carlo (ARMCMC), to address certain weaknesses of traditional online identification methods, such as being applicable only to systems that are linear in the parameters (LIP), having persistent excitation (PE) requirements, and assuming Gaussian noise. ARMCMC is an online method that takes advantage of the previous posterior distribution whenever there is no sudden change in the parameter distribution. To achieve this, we define a new variable jump distribution that accounts for the degree of model mismatch using a temporal forgetting factor. The temporal forgetting factor is computed from a model-mismatch index and determines whether ARMCMC employs modification or reinforcement to either restart or refine the parameter distribution. Because this factor is a function of the observed data rather than a user-defined constant, it can effectively adapt to the underlying dynamics of the system.
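The interplay between the temporal forgetting factor and the variable jump distribution can be sketched as follows. This is an illustrative toy version only: the exponential mapping from the mismatch index, the local step size, and the prior width are all assumptions for this sketch, not the paper's actual construction:

```python
import numpy as np

def temporal_forgetting_factor(mismatch, scale=1.0):
    """Illustrative mapping of a model-mismatch index to a factor in (0, 1].
    Small mismatch -> factor near 1 (reinforce the previous posterior);
    large mismatch -> factor near 0 (restart from the prior)."""
    return float(np.exp(-scale * mismatch))

def variable_jump(theta_prev, lam, rng, local_step=0.1, prior_std=1.0):
    """Illustrative variable jump distribution: with probability lam,
    propose a local move around a sample from the previous posterior
    (reinforcement); otherwise restart from a broad prior (modification)."""
    if rng.uniform() < lam:
        return rng.choice(theta_prev) + local_step * rng.standard_normal()
    return prior_std * rng.standard_normal()

rng = np.random.default_rng(1)
prev = np.full(100, 5.0)                      # previous posterior near 5
near = variable_jump(prev, lam=1.0, rng=rng)  # exploit: stays close to 5
```

Setting `lam` close to 1 keeps proposals near the previous posterior (exploitation), while `lam` near 0 restarts exploration from the prior after an abrupt parameter change.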
We demonstrate our method on two examples, a soft bending actuator and the Hunt-Crossley model, and show favorable performance compared to state-of-the-art baselines. The rest of this paper is organized as follows: Sec. 2 presents introductory context on the Bayesian approach and MCMC. Sec. 3 is devoted to presenting the proposed ARMCMC approach in a step-by-step algorithm. Simulation results on a soft bending actuator, together with empirical results on a reality-based model of a soft contact environment capturing Hunt-Crossley dynamics, are presented in Sec. 4. Lastly, final remarks and future directions are given in Sec. 5.

2. PRELIMINARIES

2.1 PROBLEM STATEMENT

In the Bayesian paradigm, estimates of parameters are given in the form of the posterior probability density function (pdf); this pdf can be continuously updated as new data points are received. Consider the following general model: $Y = F(X, \theta) + \nu$, where $Y$, $X$, $\theta$, and $\nu$ are the concurrent output, input, model parameters, and noise vector, respectively. To calculate the posterior probability, the observed data along with a prior distribution are combined via Bayes' rule (Khatibisepehr et al., 2013). The data includes input/output pairs $(X, Y)$. We will be applying updates to the posterior pdf using batches of data points; hence, it is convenient to partition the data as follows:

$$D^t = \{(X, Y)_{t_m}, (X, Y)_{t_m+1}, \cdots, (X, Y)_{t_m+N_s+1}\},$$

where $N_s = T_s/T$ is the number of data points in each data pack, with $T, T_s$ being the data and algorithm sampling times, respectively. This partitioning is convenient for online applications, as $D^{t-1}$ should have been previously collected so that the algorithm can be executed from $t_m$ to $t_m+N_s+1$, i.e., during algorithm time step $t$. Ultimately, inferences are completed at $t_m+N_s+2$. Fig. 1 illustrates the timeline for the data and the algorithm. It is worth mentioning that the computation can be done in parallel by performing the phases of adjacent algorithm steps simultaneously (e.g., phase A of algorithm step $t$, phase B of step $t-1$, and phase C of step $t-2$). According to Bayes' rule, and assuming data points are independent and identically distributed (i.i.d.), in equation 1 we have

$$P(\theta^t \mid [D^{t-1}, D^t]) = \frac{P(D^t \mid \theta^t, D^{t-1})\, P(\theta^t \mid D^{t-1})}{\int P(D^t \mid \theta^t, D^{t-1})\, P(\theta^t \mid D^{t-1})\, d\theta^t},$$

where $\theta^t$ denotes the parameters at the current time step. $P(\theta^t \mid D^{t-1})$ is the prior distribution over the parameters, which is also the posterior distribution at the previous algorithm sampling time. $P(D^t \mid \theta^t, D^{t-1})$ is the likelihood function, obtained from the one-step-ahead prediction: $\hat{Y}^{t|t-1} = F(D^{t-1}, \theta^t)$,
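The data-pack partitioning and the likelihood evaluation over a batch can be sketched as below. The Gaussian noise model, the toy linear model $F(X,\theta)=\theta X$, and the noise level are assumptions for illustration only (ARMCMC itself does not require Gaussian noise):

```python
import numpy as np

def log_likelihood(theta, X, Y, model, noise_std=0.1):
    """Log-likelihood (up to a constant) of a data pack under an
    assumed Gaussian noise model, using the one-step-ahead
    prediction Y_hat = F(X, theta)."""
    resid = Y - model(X, theta)
    return -0.5 * np.sum((resid / noise_std) ** 2)

def data_packs(X, Y, n_s):
    """Partition a data stream into packs of N_s points (T_s = N_s * T)."""
    for i in range(0, len(X) - n_s + 1, n_s):
        yield X[i:i + n_s], Y[i:i + n_s]

# Toy data from F(X, theta) = theta * X with true theta = 2.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, 40)
Y = 2.0 * X + 0.05 * rng.standard_normal(40)
packs = list(data_packs(X, Y, n_s=10))
```

Evaluating `log_likelihood` on each incoming pack, with the previous pack's posterior as the prior, is the recursion that equation 1 formalizes.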

