ASSOCIATION RULES IN QUBO SAMPLES AND WHERE TO FIND THEM

Abstract

Samples drawn for a Quadratic Unconstrained Binary Optimization (QUBO) problem sometimes exhibit strong associations between variables. A natural question arises: is there any value in these associations? We study the max-cut problem and observe that associations can be represented as rules that simplify a QUBO problem, which matters because classical and quantum annealers work better when the problem size is smaller. To find associations between variables effectively and efficiently, we adapt traditional association rule mining to the case of QUBO samples and propose a Fast Association Rule Mining algorithm (FARM) designed specifically for mining QUBO samples. We also propose strategies and a workflow to select and apply promising rules and thereby simplify QUBO problems. We evaluate our method on the D-Wave Quantum Annealer as well as the Fujitsu Digital Annealer. The experiments demonstrate the utility of FARM as a visualisation tool for understanding associations in QUBO samples. The results also demonstrate the potential of our method in closing the gap between samples and ground truth. The source code will be disclosed to the public if the manuscript is accepted.

1. INTRODUCTION

Many combinatorial optimization problems can be formulated as a quadratic unconstrained binary optimization (QUBO) problem Lucas (2014), Glover et al. (2018). QUBO corresponds naturally to the transverse Ising model and benefits from the speed-up offered by quantum annealing Kadowaki & Nishimori (1998). The combination of QUBO and quantum annealing Kadowaki & Nishimori (1998) has proved its utility in various applications, such as the Traveling Salesman Problem (TSP) Martoňák et al. (2004), graph colouring Titiloye & Crispin (2011), portfolio optimization Venturelli & Kondratyev (2019), and the resource scheduling problem Ikeda et al. (2019).
Annealing, e.g., simulated annealing Kirkpatrick et al. (1983), is a family of probabilistic methods for optimising the variables of a system (e.g., minimising a function). In annealing, we heat the system to a high temperature and gradually bring the temperature down. Variables in the system gradually lose their energy and eventually settle in a low-energy state. One round of heating and cooling, i.e., one annealing process, produces one sample, which is a configuration of all variables. The annealing process is random. The energy distribution of the samples follows a Boltzmann distribution Nelson et al. (2022), i.e., the system is more likely to be in a lower-energy state. Quantum annealing works similarly, except that it makes use of quantum mechanics.
We can use annealing for optimisation in a sampling manner. While we cannot expect to find the optimal solution in one shot, one way to obtain a promising solution is to collect more samples. However, the marginal benefit of each additional sample decreases as we collect more samples. Besides, access to quantum devices is expensive. We are therefore looking for a more efficient method that makes additional samples more productive. The main idea of the paper is to examine existing samples to discover interesting rules and use them to simplify the original QUBO problem.
More specifically, if most of the variables agree on a certain decision, then these variables have most likely been solved correctly. We should then concentrate on solving the remaining variables, which we are less certain of.
Earlier works in this domain have their own limitations. Lewis & Glover (2017) invented a few rules to transform the underlying QUBO structure into an equivalent graph with fewer nodes and edges. However, such rules are hand-crafted. A question of interest is whether we can discover such rules automatically, including rules that are specific to a particular application. Chardaire et al. (1995); Karimi & Rosenberg (2017); Irie et al. (2021) utilise sample persistency, which concerns the persistency of individual variables and omits the associations between variables. These methods do not work well if persistency is not explicitly present in the samples. For example, in a max-cut problem, each sample and its flipped version are equivalent, and both are equally likely to appear in sampling. Therefore, no variable in max-cut is going to be persistent. Other problems, such as number partitioning, graph colouring, and TSP, share a similar feature.
In this paper, we propose to use a heuristic method to discover association rules between variables. The concept of traditional Association Rule Mining (ARM) fits our purpose very well, as it was originally designed for discovering associations between items in Market Basket Analysis (MBA) Agrawal et al. (1993). By adapting traditional ARM to the task of mining QUBO samples, we can discover associations between variables, reveal hidden persistency in samples, and simplify QUBO problems accordingly. By solving a smaller QUBO problem, we stand a better chance of obtaining more promising results. Our contributions can be summarised as follows.
• We adapt association rule mining to the case of mining rules from QUBO samples. We introduce the concept of equivalent alternatives in the definition to cope with the special requirements of applications like max-cut.
• We explore effective ways to select and apply rules to reduce QUBO problems.
• We propose the Fast Association Rule Mining (FARM) algorithm for mining QUBO samples, to efficiently discover rules across a wide range of support (typically 10-90%).
In the experiments, we evaluate our definitions and methods on two solvers, the D-Wave Pegasus Quantum Annealer Willsch et al. (2022) (QA) and the Fujitsu 2nd Gen Digital Annealer Aramon et al. (2019) (DA). We use max-cut problems as a case study and include two datasets with different topologies in the experiment. Results on QA suggest that FARM can serve as a visualisation tool to help understand associations between variables. By fixing variables according to the discovered rules, we can simplify a QUBO problem and find more promising solutions in a second run of the solvers. Results on DA suggest that classical annealing-based solvers produce a very low level of persistency on large problems and receive very limited benefit from the proposed method. Comparing the results on QA and DA suggests that association rules have more potential on QA.
The rest of the paper is organised as follows. Section 2 reviews existing literature and describes the position of this work. Section 3 introduces the concept of association rules in the case of mining QUBO samples and presents a workflow and a few strategies that apply the discovered rules to simplify a QUBO problem. Section 4 describes the assumptions and design ideas of Fast Association Rule Mining (FARM) for QUBO samples. Section 5 quantitatively compares our methods with a representative baseline method. Section 6 draws the conclusion.
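The max-cut argument in the introduction (flipped samples are equivalent, so no single variable is persistent, yet pairs of variables can still be strongly associated) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the 4-cycle graph, the helper names `maxcut_qubo` and `energy`, and the use of exhaustively enumerated ground states as a stand-in for idealised annealer samples. It is not the FARM algorithm, only a toy view of the kind of pairwise association that rule mining would look for.

```python
import itertools
import numpy as np

def maxcut_qubo(n, edges):
    """Hypothetical helper: upper-triangular Q with x^T Q x = -(cut size of x)."""
    Q = np.zeros((n, n))
    for i, j in edges:
        a, b = min(i, j), max(i, j)
        # Edge (a, b) is cut when x_a != x_b, i.e. x_a + x_b - 2*x_a*x_b = 1.
        Q[a, a] -= 1
        Q[b, b] -= 1
        Q[a, b] += 2
    return Q

def energy(Q, x):
    """QUBO objective x^T Q x for a binary vector x."""
    return float(x @ Q @ x)

# Assumed toy instance: a 4-cycle 0-1-2-3-0.
n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
Q = maxcut_qubo(n, edges)

# Enumerate all 2^n assignments (feasible only for tiny n).
all_x = np.array(list(itertools.product([0, 1], repeat=n)))
energies = np.array([energy(Q, x) for x in all_x])

# Flipping every bit leaves the max-cut energy unchanged.
flip_invariant = all(energy(Q, x) == energy(Q, 1 - x) for x in all_x)

# Stand-in for idealised samples: the set of minimum-energy assignments.
ground = all_x[energies == energies.min()]

# Per-variable persistency: fraction of samples in which each variable is 1.
marginals = ground.mean(axis=0)

# Pairwise association: fraction of samples in which x_i equals x_j.
agreement = np.array(
    [[np.mean(ground[:, i] == ground[:, j]) for j in range(n)] for i in range(n)]
)

print("flip invariant:", flip_invariant)
print("marginals:", marginals)
print("agreement:\n", agreement)
```

In this toy instance every marginal sits at one half, so a persistency test on individual variables finds nothing to fix, while the agreement matrix contains entries at the extremes: some pairs always take equal values and others always take opposite values. Relations of this pairwise kind are what an association rule between two variables would express.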

2.1. OPTIMISATION

A QUBO minimisation problem can be written in the form $\min_{x \in \{0,1\}^n} x^T Q x$, where $x$ is a vector of $n$ binary decision variables and $Q$ is the QUBO matrix. Sometimes, the answers to some variables are trivial, in the sense that their values remain unchanged in the optimal solution regardless of the values taken by other variables. This is called variable persistence in Boros et al. (2006). We can assign these values so that the variables are fixed. By this partial assignment, we effectively reduce the number of variables and, in turn, reduce or simplify the QUBO problem. There are a few ways to identify such partial assignments. One way is to check the QUBO matrix against a set of rules Lewis & Glover (2016; 2017); Glover et al. (2018). These include manually crafted rules for assigning values to individual variables and to pairs of variables. This method is designed as a preprocessor that reduces the size of a problem before passing it to any solver. The preprocessor "QPro" scans through a QUBO matrix and applies the rules iteratively until no further variables can be assigned values. Experiments suggest that QPro works effectively on a synthetic dataset with sparse connectivity and non-uniform distributions Glover et al. (2018) and on a scientific dataset Şeker et al. (2020), but fails to yield any reduction on the dense and uniformly distributed ORLIB 2500-variable problems Beasley (2010) and a real-world problem Kim et al.

