HIGH-PRECISION REGRESSORS FOR PARTICLE PHYSICS

Abstract

Monte Carlo simulations of physics processes at particle colliders, such as the Large Hadron Collider at CERN, take up a major fraction of the computational budget. For some simulations, a single data point takes seconds, minutes, or even hours to compute from first principles. Since the necessary number of data points per simulation is on the order of 10^9-10^12, machine learning regressors can be used in place of physics simulators to significantly reduce this computational burden. However, this task requires high-precision regressors that can deliver predictions with relative errors of less than 1%, or even 0.1%, over the entire domain of the function. In this paper, we develop optimal training strategies and tune various machine learning regressors to satisfy the high-precision requirement. We leverage symmetry arguments from particle physics to optimize the performance of the regressors. Inspired by ResNets, we design a Deep Neural Network with skip connections that outperforms fully connected Deep Neural Networks. We find that at lower dimensions, boosted decision trees far outperform neural networks, while at higher dimensions neural networks perform significantly better. We show that these regressors can speed up simulations by a factor of 10^3-10^6 over the first-principles computations currently used in Monte Carlo simulations. Additionally, using symmetry arguments derived from particle physics, we reduce the number of regressors necessary for each simulation by an order of magnitude. Our work can significantly reduce the training and storage burden of Monte Carlo simulations at current and future collider experiments.

1. INTRODUCTION

Particle physics experiments, like those at the Large Hadron Collider at CERN, are running at progressively higher energies and are collecting more data than ever before. As a result, the experimental precision of the measurements they perform is continuously improving. However, to infer what these measurements mean for the interactions between the fundamental constituents of matter, they have to be compared with, and interpreted in light of, our current theoretical understanding. This is done by performing first-principles computations for these high-energy processes order by order in a power series expansion. After the computation, the resulting function is used in Monte Carlo simulations. The successive terms in the power series expansion, simplistically, become progressively smaller. Schematically, this can be written as:

F(x) = f_00(x) + α f_01(x) + α² {f_11(x) + f_02(x)} + . . . ,     (1)

where α ≪ 1 is the small expansion parameter. The term of interest to our current work is the one enclosed by the curly braces in equation (1), which we will refer to as the second-order term¹. The function F(x) must be evaluated on the order of 10^9-10^12 times for each simulation. However, for many processes, evaluating the second-order term, and specifically f_02, is computationally space- and time-intensive, and a single data point can take several seconds to compute. Moreover, these samples cannot be reused, leading to an overall high cost of computation for the entire process under consideration. Building surrogate models to speed up Monte Carlo simulations is highly relevant not only in particle physics but in a very large set of problems addressed by all branches of physics using a perturbative expansion like the one in equation (1). We give a broader overview of the physics motivation and applications in appendix A. A simple solution to speed up the computation of the functions is to build a regressor using a representative sample.
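As a toy illustration of the truncated expansion in equation (1), the sketch below uses hypothetical stand-ins for the coefficient functions f_00, f_01, f_11, and f_02 and an arbitrary value of α; the real coefficients come from first-principles computations and are far more expensive to evaluate:

```python
import math

# Hypothetical component functions standing in for the perturbative
# coefficients f_00, f_01, f_11, f_02 of equation (1). These are NOT
# the real amplitudes; they only illustrate the structure of the series.
def f00(x): return math.exp(-x)
def f01(x): return math.sin(x)
def f11(x): return x * x
def f02(x): return math.cos(x)

ALPHA = 0.1  # small expansion parameter, alpha << 1 (illustrative value)

def F_truncated(x):
    """Evaluate F(x) up to and including the second-order term.

    The second-order piece {f_11 + f_02} is suppressed by alpha^2, so
    successive terms in the series become progressively smaller.
    """
    return f00(x) + ALPHA * f01(x) + ALPHA**2 * (f11(x) + f02(x))

print(F_truncated(1.0))
```

In a real simulation it is precisely the α²-suppressed term whose per-point cost dominates, which is what motivates replacing it with a fast surrogate.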
However, to achieve the precision necessary for matching with experimental results, the regressors need to produce very-high-accuracy predictions over the entire domain of the function. The requirements that we set for the regressors, and in particular what we mean by high precision, are:

• High precision: prediction error < 1% over more than 90% of the domain of the function
• Speed: prediction time per data point of < 10^-4 seconds
• Lightweight: the disk size of the regressors should be a few megabytes at most, for portability

In this work we explore the following novel concepts:

• With simulated data from real physics processes occurring in particle colliders, we study the error distributions over the entire input feature space of multi-dimensional distributions when using boosted decision trees (BDTs), Deep Neural Networks (DNNs), and Deep Neural Networks with skip connections (sk-DNNs).
• We study these regressors for 2-, 4-, and 8-dimensional (D) data, comparing the performance of BDTs, DNNs, and sk-DNNs with the aim of reaching errors smaller than 1%-0.1% over at least 90% of the input feature space.
• We outline the architectural decisions, training strategies, and data volumes necessary for building these various kinds of high-precision regressors.

In what follows, we show that we can reduce the compute time of the most compute-intensive part, f_11(x) + f_02(x) (defined in equation (1)), by several orders of magnitude, from several seconds down to sub-milliseconds, without compromising prediction accuracy. We show that physics-motivated normalization strategies, learning strategies, and the invocation of physics symmetries are necessary to achieve the goal of high precision. In our experiments, the BDTs outperform the DNNs at lower dimensions, while the DNNs give comparable (for 4D) or significantly better (for 8D) accuracy at higher dimensions.
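The high-precision requirement above can be phrased as a simple coverage metric over a held-out test set: the fraction of points whose relative error is below the tolerance must exceed 0.90. A minimal sketch (the function name and data are illustrative, not from the paper):

```python
import numpy as np

def precision_coverage(y_true, y_pred, tol=0.01):
    """Fraction of test points whose relative prediction error is below tol."""
    rel_err = np.abs(y_pred - y_true) / np.abs(y_true)
    return np.mean(rel_err < tol)

# A regressor meets the high-precision criterion if coverage > 0.90
# at tol = 0.01 (i.e., < 1% error over > 90% of the domain).
y_true = np.array([1.0, 2.0, 4.0, 8.0])
y_pred = np.array([1.005, 1.99, 4.002, 8.5])  # last point is ~6% off
print(precision_coverage(y_true, y_pred))     # 3 of 4 points within 1%
```

Reporting coverage rather than an averaged error matters here, because a small mean error can hide large failures in localized regions of the input feature space.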
DNNs with skip connections perform comparably to fully connected DNNs with far fewer parameters, and outperform DNNs of equivalent complexity. Moreover, DNNs and sk-DNNs meet and exceed the high-precision criteria on 8D data, while BDTs fail to. Our goal is to build the most lightweight regressor possible for real-time prediction, facilitating the speed-up of the Monte Carlo simulation.
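To make the skip-connection idea concrete, the following is a minimal numpy sketch of a ResNet-style forward pass, in which each hidden block adds its input back to its transformed output. The layer widths, depth, and initialization here are illustrative only and are not the architecture used in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def skip_block(h, W, b):
    """One hidden block with a ResNet-style skip connection: the input h
    is added back to the transformed output, which eases optimization in
    deep networks and lets a narrower network match a wider plain DNN."""
    return relu(h @ W + b) + h

# Illustrative sizes: an 8-D input mapped to a width-64 trunk,
# three skip blocks, then a scalar regression output.
width = 64
W_in, b_in = rng.normal(size=(8, width)) * 0.1, np.zeros(width)
blocks = [(rng.normal(size=(width, width)) * 0.05, np.zeros(width))
          for _ in range(3)]
W_out, b_out = rng.normal(size=(width, 1)) * 0.1, np.zeros(1)

def sk_dnn_forward(x):
    h = relu(x @ W_in + b_in)
    for W, b in blocks:
        h = skip_block(h, W, b)
    return h @ W_out + b_out

x = rng.normal(size=(5, 8))      # batch of five 8-D feature vectors
print(sk_dnn_forward(x).shape)   # (5, 1): one prediction per input point
```

Because the skip path carries the identity, gradients propagate through the sum unattenuated, which is the usual motivation for residual connections in deeper regression networks.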

2. RELATED WORK

Building models for the regression of amplitudes has been a continuing effort in the recent particle physics literature. Boosted decision trees (BDTs) have long been the workhorse of particle physics, but mostly for classifying tiny signals against dominating backgrounds (Radovic et al., 2018). However, the use of BDTs as regressors for theoretical estimates of experimental signatures has only been advocated recently (Bishara & Montull, 2019) and has been shown to achieve impressive accuracy for 2D data. GANs (Goodfellow et al., 2014; Springenberg, 2016; Brock et al., 2018) and VAEs (Brock et al., 2018) have been used for sample generation (Butter et al., 2021; Otten et al., 2021). Similar applications have surfaced in other domains of physics where Monte Carlo simulations are used. Self-learning Monte Carlo methods have been explored by Liu et al. (2017). Applications of



¹ Here, order refers to the power of the expansion coefficient α.



Several other machine learning algorithms have been used for speeding up sample generation for Monte Carlo simulations. Winterhalder et al. (2022) proposed the use of Normalizing Flows (Jimenez Rezende & Mohamed, 2015) with Invertible Neural Networks to implement importance sampling (Müller et al., 2018; Ardizzone et al., 2018). Recently, neural network surrogates have been used to aid Monte Carlo simulations of collider processes (Danziger et al., 2022). Badger et al. (2022) used Bayesian Neural Networks for the regression of particle physics amplitudes, with a focus on understanding error propagation and estimation. Chen et al. (2021) attempted to reach the high-precision regime with neural networks and achieved 0.7% errors integrated over the entire input feature space. Physics-aware neural networks were studied by Maître & Truong (2021) in an attempt to handle singularities in the regressed functions. In the domain of generative models,

