MONOTONIC KRONECKER-FACTORED LATTICE

Abstract

Learning flexible monotonic functions that guarantee model behavior and provide interpretability beyond a few input features is computationally challenging, and as minimizing resource use becomes increasingly important, such models must also be efficient to learn. In this paper we show how to learn such functions effectively and efficiently using the Kronecker-Factored Lattice (KFL), an efficient reparameterization of flexible monotonic lattice regression via the Kronecker product. Both computational and storage costs scale linearly in the number of input features, a significant improvement over existing methods, whose costs grow exponentially. We also show that monotonicity and other shape constraints can still be properly enforced. The KFL function class consists of products of piecewise-linear functions, and the size of the function class can be further increased through ensembling. We prove that the function class of an ensemble of M base KFL models strictly increases as M increases up to a certain threshold, beyond which every multilinearly interpolated lattice function can be expressed. Our experimental results demonstrate that KFL trains faster with fewer parameters while achieving accuracy and evaluation speeds comparable to or better than the baseline methods and preserving monotonicity guarantees on the learned model.

1. INTRODUCTION

Many machine learning problems have requirements beyond accuracy, such as efficiency, storage, and interpretability. For example, the ability to learn flexible monotonic functions at scale is useful in practice because machine learning practitioners often know a priori which input features positively or negatively relate to the output, and can incorporate these hints as an inductive bias during training to further regularize the model (Abu-Mostafa, 1993) and guarantee its expected behavior on unseen examples. It is, however, computationally challenging to learn such functions efficiently. In this paper, we extend the work on interpretable monotonic lattice regression (Gupta et al., 2016) to significantly reduce computational and storage costs without compromising accuracy. While a linear model with nonnegative coefficients can learn simple monotonic functions, its function class is restricted. Prior works proposed non-linear methods (Sill, 1997; Dugas et al., 2009; Daniels & Velikova, 2010; Qu & Hu, 2011) to learn more flexible monotonic functions, but these have been shown to work only in limited settings with small datasets and low-dimensional feature spaces. Monotonic lattice regression (Gupta et al., 2016), an extension of lattice regression (Garcia et al., 2012), learns an interpolated look-up table with linear inequality constraints that impose monotonicity. This approach has been demonstrated to work with millions of training examples and to achieve competitive accuracy against, for example, random forests; however, because the number of model parameters scales exponentially in the number of input features, such models are difficult to apply in high-dimensional feature spaces. To overcome this limitation, Canini et al. (2016) incorporated ensemble learning to combine many tiny lattice models, each capturing non-linear interactions among a small random subset of features.
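To make the scaling contrast concrete, the following sketch (our own illustration, not the authors' implementation; function names and the rank-1 factorization layout are assumptions) compares multilinear interpolation on a full lattice, which stores K^D vertex values for D features with K keypoints each, against a Kronecker-factored evaluation, which stores only T·D·K per-dimension vectors. Interpolating a rank-1 parameter tensor factorizes into a product of 1-D piecewise-linear interpolations, which is the source of the linear cost.

```python
import numpy as np

def pwl_interp_1d(knots, x):
    """1-D piecewise-linear interpolation of x in [0, K-1] against K knot values."""
    lo = int(np.clip(np.floor(x), 0, len(knots) - 2))
    frac = x - lo
    return (1 - frac) * knots[lo] + frac * knots[lo + 1]

def full_lattice_eval(theta, x):
    """Multilinear interpolation on a full lattice: theta holds K**D parameters."""
    for d in range(theta.ndim):
        lo = int(np.clip(np.floor(x[d]), 0, theta.shape[0] - 2))
        frac = x[d] - lo
        # Reduce one axis at a time by linear interpolation along axis 0.
        theta = (1 - frac) * theta[lo] + frac * theta[lo + 1]
    return float(theta)

def kfl_eval(factors, x):
    """Kronecker-factored evaluation: factors has shape (T, D, K), i.e. T*D*K
    parameters. Because interpolation is linear in the parameter tensor and
    multilinear interpolation is separable, interpolating a sum of rank-1
    (Kronecker product) tensors reduces to a sum of products of 1-D
    piecewise-linear interpolations, at O(T*D) cost per evaluation."""
    return float(sum(
        np.prod([pwl_interp_1d(w, x[d]) for d, w in enumerate(term)])
        for term in factors
    ))
```

For example, with D = 20 features and K = 2 keypoints each, a full lattice stores 2^20 (over one million) parameters, whereas T Kronecker-factored terms store only 40T. Monotonicity can then be enforced through constraints on the small per-dimension vectors rather than on the full exponential-size lattice; the exact constraint scheme is part of the paper's contribution and is not shown here.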
This paper proposes the Kronecker-Factored Lattice (KFL), a novel reparameterization of monotonic lattice regression via the Kronecker product that achieves significant parameter efficiency while simultaneously guaranteeing monotonicity of the learned model with respect to a user-prescribed set of input

