UNSUPERVISED LEARNING FOR COMBINATORIAL OPTIMIZATION NEEDS META LEARNING

Abstract

A general framework of unsupervised learning for combinatorial optimization (CO) is to train a neural network whose output gives a problem solution by directly optimizing the CO objective. Although this approach has some advantages over traditional solvers, current frameworks optimize an averaged performance over the distribution of historical problem instances, which misaligns with the actual goal of CO: finding a good solution to every future encountered instance. With this observation, we propose a new objective of unsupervised learning for CO, where the goal of learning is to search for good initializations for future problem instances rather than to give direct solutions. We propose a meta-learning-based training pipeline for this new objective. Our method achieves strong empirical performance: even the initial solutions given by our model before fine-tuning significantly outperform the baselines under various evaluation settings, including evaluation across multiple datasets and cases with large shifts in problem scale. We conjecture the reason is that meta-learning-based training keeps the model loosely tied to the local optimum induced by each training instance while remaining adaptive to changes in the optimization landscape across instances. 1

1. INTRODUCTION

Combinatorial optimization (CO), which aims to find the optimal solution in a discrete search space, has a pivotal position in scientific and engineering fields (Papadimitriou & Steiglitz, 1998; Crama, 1997). Most CO problems are NP-complete or NP-hard. Conventional heuristics or approximation algorithms require deep insight into the particular problem. Starting from the seminal work of Hopfield & Tank (1985), researchers have applied neural networks (NNs) (Smith, 1999; Vinyals et al., 2015) to solve CO problems. The motivation is that NNs may learn heuristics by solving historical problems, and these heuristics could be useful for solving similar problems in the future. Many NN-based methods (Selsam et al., 2018; Joshi et al., 2019; Hudson et al., 2021; Gasse et al., 2019; Khalil et al., 2016) require optimal solutions to the CO problem as supervision in training. However, optimal solutions are hard to obtain in practice, and the obtained model often does not generalize well (Yehuda et al., 2020). Methods based on reinforcement learning (RL) (Mazyavkina et al., 2021; Bello et al., 2016; Khalil et al., 2017; Yolcu & Póczos, 2019; Chen & Tian, 2019; Yao et al., 2019; Kwon et al., 2020; 2021; Delarue et al., 2020; Nandwani et al., 2021) do not need labels but often suffer from notoriously unstable training. Recently, unsupervised learning methods have attracted much attention (Toenshoff et al., 2021; Amizadeh et al., 2018; Yao et al., 2019; Karalias & Loukas, 2020; Wang et al., 2022). A common strategy of these methods is to design an NN whose output gives a solution to the CO problem and then train the NN via gradient descent by directly optimizing the CO objective over a set of training instances. This strategy offers faster training, good generalization, and a strong capability for dealing with large-scale problems. Despite this prominent progress, current unsupervised learning methods always optimize NNs toward an averaged good performance over the training instances.
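The common strategy above can be illustrated with a minimal sketch. The following toy example (not the actual GNN pipeline of any cited method) directly optimizes a differentiable relaxation of a CO objective by gradient ascent, here MaxCut on a small hand-made graph: each parameter theta_i is mapped through a sigmoid to a relaxed probability p_i that node i lies on one side of the cut, and the expected cut value is maximized; the graph, step size, and iteration count are all illustrative assumptions.

```python
import numpy as np

# Toy sketch of unsupervised learning for CO: maximize a differentiable
# relaxation of the MaxCut objective by gradient ascent. Here the "model"
# is just one free parameter per node; p_i = sigmoid(theta_i) is the
# relaxed probability that node i sits on one side of the cut.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relaxed_cut(p, edges):
    # Expected cut size under independent Bernoulli(p_i) node assignments.
    return sum(p[i] * (1 - p[j]) + p[j] * (1 - p[i]) for i, j in edges)

def grad_theta(theta, edges, n):
    # Analytic gradient of the relaxed cut w.r.t. theta (chain rule
    # through the sigmoid): d cut / d p_i = sum_{j in N(i)} (1 - 2 p_j).
    p = sigmoid(theta)
    g_p = np.zeros(n)
    for i, j in edges:
        g_p[i] += 1 - 2 * p[j]
        g_p[j] += 1 - 2 * p[i]
    return g_p * p * (1 - p)

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]  # small example graph
n = 4
rng = np.random.default_rng(0)
theta = rng.normal(size=n)

before = relaxed_cut(sigmoid(theta), edges)
for _ in range(200):                  # plain gradient ascent on the relaxation
    theta += 0.5 * grad_theta(theta, edges, n)
after = relaxed_cut(sigmoid(theta), edges)

solution = (sigmoid(theta) > 0.5).astype(int)  # round to a discrete cut
```

Training over a set of instances, as the frameworks above do, amounts to averaging such gradients over many graphs with a shared NN, which is exactly the averaged objective this paper argues against.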
This means that even if a testing instance comes from the same distribution as the training instances, the solution to this single instance may not have good quality, let alone the case when the testing instance is out-of-distribution (OOD). This raises a concern when applying NNs in practice, because practical problems often demand a good solution to every encountered instance. For example, allocating surveillance cameras is crucial for every exhibition in every art gallery, so solvers applied to this problem (O'Rourke, 1987; Yabuta & Kitazawa, 2008) should output a good solution every time. Traditional CO solvers are designed toward this goal. However, they are time-consuming and unable to learn heuristics from historical instances. So, can we leverage the benefit of learning from history while pursuing an instance-wise good solution instead of an averaged good one? This motivates us to study a new formulation of unsupervised learning for CO. We regard the objective of learning from history as searching for a good initialization for each future instance rather than giving a direct solution. Since future instances are unavailable during the training stage in practice, we propose to view each training instance as a pseudo-new instance for the rest of the training instances. Our learning objective is then to learn a good initialization of the model such that further optimization from this initialization achieves a good solution on each of these pseudo-new instances. We observe that meta learning is suitable for implementing this idea and adopt MAML (Finn et al., 2017) in our training pipeline as a proof of concept. Note that the step of optimization on each pseudo-new instance shares a similar spirit with fine-tuning a model on each downstream task, as traditional meta learning does. However, each task in our case corresponds to optimization over a single training instance.
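The initialization-learning idea above can be sketched with a first-order MAML-style loop. This is a deliberate simplification: the per-instance losses below are toy quadratics standing in for the relaxed CO objective of each instance, the first-order approximation drops MAML's second-order term, and all names and hyperparameters are illustrative assumptions rather than the paper's actual settings.

```python
import numpy as np

# First-order MAML sketch: each "task" is one training instance; the inner
# step mimics fine-tuning on that instance, and the outer step updates the
# shared initialization so that one fine-tuning step adapts well.

def instance_loss(theta, c):
    # Toy stand-in for the relaxed CO objective of one instance.
    return 0.5 * np.sum((theta - c) ** 2)

def instance_grad(theta, c):
    return theta - c

rng = np.random.default_rng(1)
centers = rng.normal(size=(8, 2))   # 8 pseudo-new training instances
theta = np.zeros(2)                 # shared initialization to be learned
alpha, beta = 0.2, 0.1              # inner (fine-tune) / outer (meta) rates

for epoch in range(100):
    meta_grad = np.zeros_like(theta)
    for c in centers:
        # Inner loop: one fine-tuning step on this pseudo-new instance.
        adapted = theta - alpha * instance_grad(theta, c)
        # Outer loop (first-order approx.): gradient of the loss
        # evaluated AFTER adaptation, accumulated across instances.
        meta_grad += instance_grad(adapted, c)
    theta -= beta * meta_grad / len(centers)

# Post-adaptation loss from the meta-learned init vs. from the naive init.
post = np.mean([instance_loss(theta - alpha * instance_grad(theta, c), c)
                for c in centers])
init = np.mean([instance_loss(np.zeros(2) - alpha * instance_grad(np.zeros(2), c), c)
                for c in centers])
```

Note that the meta-learned theta need not minimize any single instance's loss; it is chosen so that one inner gradient step lands near each instance's optimum, mirroring the paper's view of learning initializations rather than direct solutions.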
We name our method Meta-EGN, as it extends the previous framework EGN (Karalias & Loukas, 2020) via meta learning. Our key observation is that with this new objective, even the initial solution given by Meta-EGN (before fine-tuning on a test instance) is substantially better than the solutions given by EGN and other methods that optimize the averaged performance over training instances. Our conjectured reason is that the new objective, by taking into account fine-tuning the model on new instances, trains the model to avoid being trapped in the local minimum induced by each training instance while being more adaptive to the changes of optimization landscapes across instances. We demonstrate the benefits of Meta-EGN via experiments on three benchmark CO problems (max clique, vertex cover, and max independent set) over multiple synthetic graphs and three real-world graph datasets, with the number of nodes ranging from 100 to 5000. Moreover, a physics-inspired unsupervised method (Schuetz et al., 2022) was recently observed to underperform the classical degree-based greedy algorithm (DGA) (Angelini & Ricci-Tersenghi, 2019) in the max independent set (MIS) problem on large-scale random regular graphs (RRGs), which has raised attention in the machine learning community. We observe that the issues come from two aspects: (1) the graph neural networks (GNNs) used to encode regular graphs suffer from the node ambiguity issue due to their limited expressive power (Xu et al., 2019); (2) the model in (Schuetz et al., 2022) did not learn from history but was directly optimized over each testing case, which tends to get trapped in a local optimum. By addressing these two issues, Meta-EGN consistently outperforms DGA while maintaining the same time complexity to generate solutions. Fig. 1 shows the results.



Our code is available at: https://github.com/Graph-COM/Meta_CO



Figure 1: Approximation rates of different methods on the MIS problem. Meta-EGN and EGN (Karalias & Loukas, 2020) are trained on RRGs with 1000 nodes and node degrees randomly sampled from {3, 7, 10, 20}, and are evaluated on larger RRGs with 10^3 ∼ 10^5 nodes. More details about the setting are in Secs. 5.1 and 5.4. Meta-EGN outperforms DGA (Angelini & Ricci-Tersenghi, 2019) by about 0.3% to 0.5% in approximation rate on average.

