CREDIBLE, SEALED-BID, OPTIMAL REPEATED AUCTIONS WITH DIFFERENTIABLE ECONOMICS

Abstract

Online advertisement auctions happen billions of times per day. Bidders in auctions strategize to improve their own utility, subject to the published auction rules. Yet, bidders may not know whether an auction has been run as promised. A credible auction is one in which bidders can trust the auctioneer to run its allocation and pricing mechanisms as promised. It is known that, assuming no communication between bidders, no credible, sealed-bid, and incentive compatible (aka "truthtelling" or otherwise truthful-participation-incentivizing) mechanism can exist. In reality, bidders can certainly communicate, so what happens if we relax this (typically unrealistic) constraint? In this work, we propose a framework incorporating cryptography to allow computationally efficient, credible, revenue-maximizing (aka "optimal") auctions in a repeated auction setting. Our contribution is two-fold. First, we introduce a protocol for running repeated auctions with a verification scheme, and we show that such a protocol can eliminate the auctioneer's incentive to deviate while costing negligible additional computation. Second, we provide a method for training optimal auctions under uncertain bidder participation profiles, which generalizes our protocol to a much wider class of auctions. Our empirical results show strong support for both the theory and the competency of the proposed method.

1. INTRODUCTION

The problem of designing optimal, or revenue-maximizing, auctions bears significant theoretical and practical importance in economics: every Google search involves a sponsored search auction, webpage views involve real-time auctions for ads, and online platforms like eBay and Amazon have created markets run by auctions. This problem is non-trivial: the auctioneer's revenue depends on the "best response" strategy of each bidder, and these strategies can depend on one another. In his Nobel-prize-winning work, Myerson showed that the n-bidder, 1-item optimal auction can be solved by essentially computing a virtual bid for each bidder and then maximizing welfare Myerson (1981); Daskalakis (2015). What about multi-item auctions? This has been shown to be no easy task. One clear reason for the difficulty is the size of the bundling space, which grows exponentially in the number of items. Additionally, an auctioneer may set reserve prices or draw lotteries to earn additional revenue. In essence, the optimal auction can be weird and "defying intuition" Daskalakis (2015). Given that no analytical solution has been found for the optimal multi-item auction, Daskalakis et al. (2014) turned to the complexity of the problem and demonstrated that, under reasonable assumptions, finding the optimal multi-item auction is #P-hard. This has motivated the line of work called "differentiable economics", which focuses on using machine learning to find desirable solutions to mechanism design problems, including auction design Dütting et al. (2019). Differentiable economics approaches treat an auction as a function that takes bids as inputs and returns which item is allocated to whom and how much each bidder pays. This function is usually encoded as a neural network, which can be trained by backpropagation given a differentiable loss function.
The loss function is parameterized by the revenue, incentive compatibility (which we define and discuss in more detail in later sections), or other desirable properties of the auction Peri et al. (2021); Kuo et al. (2020). Although differentiable economics is a newly emerged field, recent progress Dütting et al. (2019); Rahme et al. (2020; 2021a); Curry et al. (2021) shows that it may be the most promising method for approximating optimal multi-item auctions. Besides optimality, credibility of the auction is another major consideration. Consider a sealed-bid, one-item, second-price auction run between bidder 1 and bidder 2, where bidder 1 has a valuation of $2 and bidder 2 has a valuation of $3. Acting on their best strategies, bidder 1 and bidder 2 bid $2 and $3, and the auctioneer should then allocate the item to bidder 2 and charge them $2. However, since the auction is sealed-bid, the auctioneer can tell bidder 2 that they won the auction and that bidder 1 bid $2.99, which would increase the auctioneer's revenue by $0.99. An auction is said to be credible if the auctioneer has no incentive to deviate from their proposed auction. The significance of auction credibility was brought to light when Google was called out for gaming their proposed second-price online ad auction Schiff (2022). It is known that there exists no sealed-bid, incentive-compatible, and credible auction if communication between bidders is precluded Akbarpour & Li (2020). The authors of that work acknowledge that modern cryptography along with bidder communication can potentially break the trilemma, but they consider the costs of cryptographic constructions, in terms of computing resources and latency, too high. In this paper, we propose an approach for running repeated auctions that greatly reduces the cost of a credible multi-item auction when verified either by revealing bids (i.e.
we greatly reduce the number of bids that must be revealed) or using a cryptographic mechanism such as zero-knowledge proofs. First, we show that in a sequence of repeated auctions, we need not run the verification mechanism for every round. Instead, we can punish deviations with a penalty that when high enough, can prevent the auctioneer from being untruthful even when only a random set of auctions are audited. The repetition of auctions naturally brings up an issue regarding bidder participation which previous works in differentiable economics did not have to deal with, as it is unrealistic to assume that the same set of bidders participates from start to end in repeated auctions. We address this issue by proposing a model that takes account of bidder participation uncertainty, and we provide a method to extend previous works in differentiable economics to this model, which we support with experimental results.

2. RELATED WORK

Neural networks. RegretNet Dütting et al. (2019) was the first work to train incentive compatible auctions to maximize revenue using deep learning. RegretNet has two components: the allocation net and the payment net. Each network treats the corresponding part of the auction as a function, taking the bids from the bidders and outputting the allocation or payment. Various networks have been developed on the basis of RegretNet to cover specific needs. Peri et al. (2021) consider possible human preferences in the allocation process. Kuo et al. (2020) focus on improving the fairness of the auction mechanisms. There are also works that focus on improving the accuracy and efficiency of RegretNet. Rahme et al. (2020) proposed ALGNet as a more efficient version of RegretNet, which frames auction design as an adversarial game between the auctioneer and the bidders.

Verification tools. To prevent the auctioneer from deviating, we need some verification method that does not reveal additional information. Angel & Walfish (2013) proposed a cryptographic verification system called VEX that can be efficiently applied to second-price auctions. In VEX, the auctioneer acts as the prover and the bidders act as the queriers. Under some given algorithm, the queriers can verify that what the prover proposed is correct, without information leakage, in a reasonable amount of time. More generally, Liu et al. (2021) have proposed zero-knowledge proof structures that work for neural networks, and Mishra et al. (2020) have described a cryptographic system that can also be applied to neural networks. With all that in mind, we can be confident that it is realistic to introduce verification tools during auction design. There also exists concrete work on establishing credibility in auction design for specific scenarios. Ferreira & Weinberg (2020) finds a credible and optimal auction for MHR valuations with commitment, and Essaidi et al.
(2022) extends that result to more general distributions of valuations. What separates this paper from theirs is that sealed-bid is not a consideration in these two papers, and our work focuses on a more general setting for neural network-encoded auctions.

3.1. AUCTION MODEL

We design a repeated auction with a set of $n$ bidders $N = \{1, 2, \ldots, n\}$ and $m$ items $M = \{1, 2, \ldots, m\}$ over some time horizon $T \in \mathbb{N}$, in which the bidders know how many bidders there are. This repeated auction does not partition one auction into several auctions that each sell a subset of the original items. Rather, each round of our repeated auction completes a multi-item auction, after which the auctioneer restocks their goods and runs another stage auction. In our model, we assume that the rounds are independent and that the bidders are memoryless agents who only try to maximize their utilities in the current round. Extensive-form game equilibria are therefore not considered, i.e. the bidders do not analyze how their actions in the current round can influence their utility in future rounds. We denote such a repeated auction by $\mathcal{A}$. During the $t$-th round, each bidder $i$ has a valuation function ${}^{t}v_i : 2^M \to \mathbb{R}_{\ge 0}$, where ${}^{t}v_i(S)$ denotes how much bidder $i$ values the set of items $S \subseteq M$ in round $t$. The valuation function ${}^{t}v_i$ is drawn independently from a distribution $F_i$ over possible valuation functions $V_i$, where $F_i$ is fixed in round $t$ and is public to both the bidders and the auctioneer. To provide a realistic simplification of the bidders' input space, we can assume the bidders have additive valuations, which means ${}^{t}v_i(S) = \sum_{s \in S} {}^{t}v_i(s)$. Therefore, we can use a matrix of size $n \times m$ to represent ${}^{t}v$. Upon receiving their valuation function at round $t$, bidder $i$ reports their bids to the auctioneer. We let ${}^{t}\theta_i \in \mathbb{R}^m$ denote bidder $i$'s bids at round $t$, and let ${}^{t}\theta \in \mathbb{R}^{n \times m}$ denote the full bid profile at round $t$. We let the set $\Theta$ contain all possible bids, and we say bidder $i$ is truthful in round $t$ iff ${}^{t}\theta_i = {}^{t}v_i$. Prior to the first round of the repeated auction $\mathcal{A}$, the auctioneer proposes a stage auction function as a tuple of an allocation function and a payment function $A = (a, p)$, where $a : \mathbb{R}^{n \times m} \to \{0, 1\}^{n \times m}$ and $p : \mathbb{R}^{n \times m} \to \mathbb{R}^n$.
We say $A = (a, p)$ is feasible if for any ${}^{t}\theta \in \Theta$ and every item $j \in M$, $\sum_{i \in N} a_{i,j}({}^{t}\theta) \le 1$; in other words, no item is allocated more than once. In round $t$, once the auctioneer receives ${}^{t}\theta$, they invoke some feasible stage auction function ${}^{t}A = ({}^{t}a, {}^{t}p)$ to compute who gets what and how much they pay. We say the repeated auction $\mathcal{A}$ is truthful iff $\forall t \in [T]$, ${}^{t}A = A$. The auctioneer's revenue in round $t$ is then ${}^{t}\mathrm{rev} = \sum_{i=1}^{n} {}^{t}a_i({}^{t}\theta) \cdot {}^{t}p_i({}^{t}\theta)$, where ${}^{t}a_i(x)$ and ${}^{t}p_i(x)$ represent the $i$-th row of ${}^{t}a(x)$ and ${}^{t}p(x)$. Traditional auction design studies ask the auctioneer to publish the mechanism function prior to the auction and to follow it strictly. Akbarpour & Li (2020) considered the auctioneer as a utility-maximizing agent as well, which opened the door to the study of auction credibility. We take a similar approach in our model. We first ask the auctioneer to publish the mechanism; however, the auctioneer is free to deviate from this plan in strategic ways, which means the auctioneer may be able to obtain higher revenue by running some $A' \neq A$ in round $t \in [T]$. We therefore define the regret of the auctioneer in round $t$ as ${}^{t}\mathrm{aRgt} = \max_{{}^{t}A} [\mathrm{rev}({}^{t}A, {}^{t}\theta)] - \mathrm{rev}(A, {}^{t}\theta)$ such that ${}^{t}A$ is feasible. Similarly, the utility of bidder $i$ is $u_i({}^{t}A, {}^{t}\theta, {}^{t}v) = {}^{t}a_i({}^{t}\theta) \cdot [{}^{t}v_i - {}^{t}p_i({}^{t}\theta)]$. A bidder may be able to obtain higher utility by misreporting; we formalize this by defining the regret of a bidder as ${}^{t}\mathrm{uRgt}_i = \max_{{}^{t}\theta_i} [u_i({}^{t}A, {}^{t}\theta, {}^{t}v)] - u_i({}^{t}A, {}^{t}v, {}^{t}v)$, where ${}^{t}\theta_{-i} = {}^{t}v_{-i}$. This is equivalent to searching for bidder $i$'s optimal misreport assuming all other bidders are truthful in round $t$.
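To make the bidder regret concrete, the following is a minimal sketch, assuming a single-item second-price auction (the example from the introduction) and a brute-force grid of candidate misreports. The function names and the grid are hypothetical choices made for illustration; they are not part of the model above.

```python
def second_price(bids):
    """Single-item second-price auction: returns (winner index, price paid)."""
    winner = max(range(len(bids)), key=lambda i: bids[i])
    price = max(b for i, b in enumerate(bids) if i != winner)
    return winner, price


def utility(i, bids, values):
    """Utility of bidder i: value minus payment if they win, otherwise 0."""
    winner, price = second_price(bids)
    return values[i] - price if winner == i else 0.0


def bidder_regret(i, values, grid):
    """Best utility gain for bidder i from misreporting while others are truthful."""
    truthful = utility(i, values, values)
    best = truthful
    for b in grid:
        bids = list(values)
        bids[i] = b  # only bidder i deviates from their valuation
        best = max(best, utility(i, bids, values))
    return best - truthful


values = [2.0, 3.0]                      # valuations from the introduction
grid = [k / 10 for k in range(0, 51)]    # candidate misreports 0.0, 0.1, ..., 5.0
regrets = [bidder_regret(i, values, grid) for i in range(len(values))]
print(regrets)
```

Under these assumptions no misreport on the grid improves on truthful bidding, which is the expected behaviour for a second-price auction.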

3.1.1. STATIONARY PARTICIPATION

Traditional auction design and differentiable economics studies usually define a stage auction for a fixed set of bidders and a fixed set of items. Our model does not require each bidder to show up to every round of the repeated auction. Rather, we use $g$ to denote a participation profile, which is a binary string of length $n$ that indicates which bidders showed up to the auction. The set $G \subseteq \{0, 1\}^n$ contains all possible profiles $g_i$, and the set $Q$ contains the values $q_i$, where $q_i$ is the probability of $g_i$ at any round $t \in [T]$. This implies that the probability distribution of participation profiles is stationary, and we call this the stationary participation model. We use the term fixed-bidder model for the special case where $|G| = 1$ and $G = \{\langle 1, 1, \ldots, 1 \rangle\}$.

3.2. CRYPTOGRAPHIC BACKGROUND

Commitments. A commitment scheme is a cryptographic protocol that allows a user to commit to data by publishing a commitment without revealing the actual data. Given the data and a commitment, anyone can verify that the data has not been changed since the commitment was published. Commitment schemes are said to be binding (once committed, data cannot be changed even by the original owner) and hiding (the commitment on its own does not reveal the data). A simple, efficient commitment scheme consists of picking a 128-bit random number and hashing it together with the data to be committed to.

Zero-Knowledge Proof. A zero-knowledge proof allows a prover to convince one or more verifiers that some statement holds without revealing how or why. Goldreich et al. (1986) have shown that there exists a zero-knowledge proof for any NP-relation; thus there exists a zero-knowledge proof for the correctness of auctions. More concretely, the last decade has seen marked advances in practical zero-knowledge proof systems, to the point where they can efficiently handle matrix multiplications and even neural networks. Liu et al. (2021) build a non-interactive zero-knowledge proof for prediction in neural networks: given a model, the zero-knowledge proof shows that its output is correct for given inputs.
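The hash-based commitment described above can be sketched as follows. This is an illustration only: the choice of SHA-256 and the byte encoding of a bid are assumptions made here, and the helper names `commit` and `verify` are hypothetical.

```python
import hashlib
import secrets


def commit(data: bytes):
    """Return (commitment, nonce) for data, using a fresh 128-bit random nonce."""
    nonce = secrets.token_bytes(16)  # 16 bytes = 128 bits
    commitment = hashlib.sha256(nonce + data).hexdigest()
    return commitment, nonce


def verify(commitment: str, nonce: bytes, data: bytes) -> bool:
    """Check that (nonce, data) opens the previously published commitment."""
    return hashlib.sha256(nonce + data).hexdigest() == commitment


bid = b"bidder=2;item=1;bid=3.00"        # hypothetical encoding of a sealed bid
published, opening = commit(bid)          # `published` is broadcast, `opening` stays private
print(verify(published, opening, bid))
print(verify(published, opening, b"bidder=2;item=1;bid=2.99"))
```

The commitment can be published when the bid is placed, and the nonce and bid are revealed only if that round is audited.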

4. PROTOCOL

We define the protocol that we propose in Algorithm 1. For the sake of comparison, we also define (Appendix A.1) the default auction protocol that would result if companies like Google were to adopt recent works in differentiable economics, such as Dütting et al. (2019); Rahme et al. (2020); Peri et al. (2021), into their sponsored search auctions.

Algorithm 1 Proposed Repeated Auction Protocol

Bidder valuation distributions made public
Auctioneer proposes $A = (a, p)$
Initialize $T \in \mathbb{N}$
Initialize logs $L_\theta, L_a, L_p \in \mathbb{R}^{T \times n \times m}$
Initialize penalty $\in (\{0\} \cup \mathbb{R}^{+})$
$t \leftarrow 1$
while $t \le T$ do
    $L_\theta[\ldots$

Definition 4.1. The verification function $\mathrm{ver}(A, {}^{t}\theta, {}^{t}a', {}^{t}p')$ takes in a stage auction $A = (a, p)$, the bids of the bidders ${}^{t}\theta$ in round $t$, along with a hypothesis allocation matrix ${}^{t}a' \in \mathbb{R}^{n \times m}$ and a hypothesis payment matrix ${}^{t}p' \in \mathbb{R}^{n \times m}$. It returns 1 if ${}^{t}a' = a({}^{t}\theta)$ and ${}^{t}p' = p({}^{t}\theta)$, and returns 0 otherwise.

The log in Algorithm 1 can be implemented with commitments and access to a broadcast channel, which ensures that all parties see the committed bids when they are announced. Upon request, the log provides the bids and results of the auction in a specific round, which can then be audited. We have abstracted the audit process as the verification function ver because it can take various forms. The most straightforward way to implement ver is to bring in a trusted third party, possibly at a cost that the bidders and the auctioneer share. This third party can simply take the bids in the audited round $r$ and run them through the proposed auction function. Alternatively, the bidders can ask the auctioneer to publish the result of the auction in round $r$, which the bidders can then verify against their commitments in round $r$. The downside of this approach is that the allocation and payment of a randomly selected round are revealed, so a bidder can possibly learn the bidding strategy of another. To address this issue, the bidders can construct a zk-SNARK (Zero-Knowledge Succinct Non-Interactive Argument of Knowledge) Liu et al. (2021) out of the auction function, which would prevent the leakage of private information during verification. Next, we show that the verification scheme together with the penalty is sufficient to make the auction credible while preserving desirable properties of the stage auction.
From now on, when we write $\mathcal{A}(A)$, we refer to Algorithm 1 implemented with $A$ as its proposed stage auction.
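The verification function of Definition 4.1 can be sketched in its simplest form, where the auditor re-runs the proposed auction on the logged bids. The sketch assumes a hypothetical single-item second-price stage auction and plain Python lists in place of matrices; a deployed version would use commitments or a zero-knowledge proof rather than the raw bids, and a neural-network auction would need a numerical tolerance instead of exact equality.

```python
def second_price(bids):
    """Hypothetical stage auction A = (a, p): one item, second-price rule."""
    n = len(bids)
    winner = max(range(n), key=lambda i: bids[i])
    price = max(b for i, b in enumerate(bids) if i != winner)
    alloc = [1 if i == winner else 0 for i in range(n)]
    pay = [price if i == winner else 0.0 for i in range(n)]
    return alloc, pay


def ver(auction, bids, alloc_claim, pay_claim):
    """Return 1 if the claimed outcome matches the proposed auction on bids, else 0."""
    alloc, pay = auction(bids)
    return 1 if (alloc == alloc_claim and pay == pay_claim) else 0


bids = [2.0, 3.0]
honest = ver(second_price, bids, [0, 1], [0.0, 2.0])      # auctioneer charges $2
inflated = ver(second_price, bids, [0, 1], [0.0, 2.99])   # auctioneer claims $2.99
print(honest, inflated)
```

With the bids from the introduction, the honest outcome passes the check and the inflated price of $2.99 fails it.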

4.1. OBTAINING CREDIBILITY

Given the proposed stage auction function $A$, let $\mathrm{aRgt}^*$ be an upper bound on ${}^{t}\mathrm{aRgt}$ over all possible reported bids in $\mathcal{A}$. This bound certainly exists if the bidders' valuation distributions are bounded; even if they are not bounded in the model, it would be reasonable to assume that they are. In practice, this bound can be found by summing the highest market price of every item. The additional revenue that the auctioneer makes by being untruthful in $w$ out of $T$ rounds is then bounded by $w \cdot \mathrm{aRgt}^*$. Since in Algorithm 1 the bidders select a round to verify uniformly at random, the probability that the auctioneer is not caught deviating is $1 - \frac{w}{T}$. So with the penalty considered, the expected additional revenue in each untruthful round is

$$\frac{1}{w}\left[\left(1 - \frac{w}{T}\right) w \cdot \mathrm{aRgt}^* - \frac{w}{T}\,\mathrm{penalty}\right] = \mathrm{aRgt}^* - \frac{w \cdot \mathrm{aRgt}^*}{T} - \frac{\mathrm{penalty}}{T} \le \mathrm{aRgt}^* - \frac{\mathrm{penalty}}{T}.$$

Notice that the right-hand side of the above inequality is independent of $w$, and when penalty $> T \cdot \mathrm{aRgt}^*$, it is negative. So if we can estimate $\mathrm{aRgt}^*$, we can set a penalty such that the auctioneer makes negative expected additional revenue per untruthful round. This should prevent the auctioneer from deviating in any round. We include a strategy for the auctioneer to make additional revenue by being untruthful when the penalty is not high enough in Appendix A.2. We have identified an approach to obtain truthful auctioneers using a verification scheme and a penalty. We now turn to the problem of maximizing the auction's revenue, which concerns the bidders' behavior. Since we have shown that credibility can be obtained independently of the bidders' behavior, we assume from now on that the auctioneer is truthful to their proposed auction.
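The penalty threshold can be checked numerically. The sketch below evaluates the expected additional revenue per untruthful round for every possible number of untruthful rounds; the values of T, aRgt* and the penalties are hypothetical and chosen only for illustration.

```python
def expected_gain_per_untruthful_round(w, T, a_rgt_star, penalty):
    """(1/w) * [(1 - w/T) * w * aRgt* - (w/T) * penalty], for 1 <= w <= T."""
    return ((1 - w / T) * w * a_rgt_star - (w / T) * penalty) / w


T = 1000            # hypothetical number of rounds
a_rgt_star = 5.0    # hypothetical upper bound on the auctioneer's per-round regret

# Penalty just above the threshold T * aRgt*: deviating never pays in expectation.
high_penalty = T * a_rgt_star + 1.0
high = [expected_gain_per_untruthful_round(w, T, a_rgt_star, high_penalty)
        for w in range(1, T + 1)]

# Penalty far below the threshold: a single deviation is profitable in expectation.
low_penalty = 100.0
low = expected_gain_per_untruthful_round(1, T, a_rgt_star, low_penalty)

print(max(high) < 0, low > 0)
```

Because the bound aRgt* - penalty/T does not depend on w, a single threshold on the penalty covers every deviation pattern.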

4.2. BIDDER BEHAVIOR

In practice, it is difficult to predict the behavior of bidders under a given mechanism. However, using the concepts of equilibrium and incentive compatibility, we can infer some behaviors of the bidders assuming rationality and perfect information. We use incentive compatibility under two solution concepts: dominant-strategy incentive compatibility (DSIC), where truthful bidding maximizes a bidder's utility regardless of the other bidders' bids, and Bayes-Nash incentive compatibility (BNIC), where truthful bidding maximizes a bidder's expected utility when the other bidders bid truthfully. The revelation principle states that for any auction mechanism $A = (a, p)$ such that there exists a bidding equilibrium under $A$, there exists a mechanism $A' = (a', p')$ such that $A'$ is incentive compatible (BNIC or DSIC, depending on the solution concept of the equilibrium) and achieves the same payoff profile as $A$ in expectation. Because of this fact, incentive compatibility is commonly adopted in mechanism design studies: it allows us to search in the smaller space of incentive compatible auctions when looking to maximize revenue. It is not hard to see that in both stage auctions and repeated auctions, DSIC implies BNIC. In fact, in the fixed-bidder model, the repeated auction $\mathcal{A}(A)$ inherits the incentive compatibility properties of the stage auction $A$.

Lemma 4.4. In the fixed-bidder model, $\mathcal{A}(A)$ is DSIC iff $A$ is DSIC; similarly, $\mathcal{A}(A)$ is BNIC iff $A$ is BNIC.

Corollary 4.4.1. In the fixed-bidder model, if $A$ is revenue-maximizing and DSIC/BNIC, and we set penalty $> T \cdot \mathrm{aRgt}^*$ in Algorithm 1, then $\mathcal{A}(A)$ is credible, DSIC/BNIC, and revenue-maximizing.

The proofs of the above statements are in Appendix A.4.1. This implies that, to find the revenue-maximizing repeated auction $\mathcal{A}(A)$ in the fixed-bidder model, we can simply use the machine learning-based techniques proposed by Dütting et al. (2019) to optimize $A$. We now turn to the stationary participation model, where the same set of items is auctioned each round, but the bidders may change according to some stationary probability distribution.
In this model, despite the auctioneer's uncertainty about which bidders will participate in future rounds, they can still observe which bidders participate in the current round, as we can emulate a non-participating bidder by assuming that their valuation for each item is 0. Therefore, the auctioneer can design a stage auction mechanism that depends on which bidders participate. Let $G$ contain all participation profiles that occur with non-zero probability, and let ${}^{t}g \in G$ be the participation profile in the $t$-th round. The auctioneer uses an aggregated auction $\hat{A}$ as their proposed stage auction, where $\hat{A}$ is defined by a mapping $d : \Theta \to G$ and a set $\mathrm{agg}(\hat{A})$ that contains an auction $A_i$ for each $g_i \in G$, which means $|\mathrm{agg}(\hat{A})| = |G| \le 2^n$. Specifically, the aggregated auction $\hat{A}$ is defined by the following piecewise function:

$$\hat{A}({}^{t}\theta) = \begin{cases} A_1({}^{t}\theta) & \text{if } d({}^{t}\theta) = g_1 \\ A_2({}^{t}\theta) & \text{if } d({}^{t}\theta) = g_2 \\ \quad\vdots & \quad\vdots \\ A_{|G|}({}^{t}\theta) & \text{if } d({}^{t}\theta) = g_{|G|} \end{cases}$$

where the mapping $d$ can be computed by rounding all non-zero entries in ${}^{t}\theta$ to 1 and then taking the maximum over each bidder's bids. This results in a binary vector that must correspond to its matching $g_i \in G$. It is not hard to see that in the stationary participation model, any stage auction function $A$ can be written in the form of an aggregated auction $\hat{A}$, which consists of a mapping from bids to a set of auctions. Therefore, we say that the auctioneer selects an aggregated auction as their proposed auction function in the stationary participation model.

Lemma 4.5. In the stationary participation model, if for any $A_i \in \mathrm{agg}(\hat{A})$, $A_i$ is DSIC/BNIC, then $\mathcal{A}(\hat{A})$ is DSIC/BNIC.

Theorem 4.6. In the stationary participation model, the repeated auction $\mathcal{A}(\hat{A})$ is revenue-maximizing iff $A_i$ is revenue-maximizing for any $A_i \in \mathrm{agg}(\hat{A})$.

The proofs of the above statements are in Appendix A.4.1. From now on, we use $\hat{A}^*$ to refer to a revenue-maximizing instance of $\hat{A}$.
Theorem 4.6 informs us that to find $\hat{A}^*$, it suffices to find a set of auctions where each auction corresponds to a participation profile and is revenue-maximizing. We now discuss our approach to this task.
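The aggregated auction can be read as a dispatch on the observed participation profile. The sketch below is an illustration under stated assumptions: bids are nested Python lists, the per-profile auctions are stand-in functions that only return a label, and the names `participation_profile` and `make_aggregated_auction` are hypothetical.

```python
def participation_profile(bids):
    """Mapping d: a bidder counts as present iff any of their bids is non-zero."""
    return tuple(1 if any(b != 0 for b in row) else 0 for row in bids)


def make_aggregated_auction(auctions_by_profile):
    """Build the piecewise auction that runs the stage auction matching d(bids)."""
    def aggregated(bids):
        return auctions_by_profile[participation_profile(bids)](bids)
    return aggregated


# Stand-in stage auctions for n = 2 bidders; each just reports which one ran.
a_hat = make_aggregated_auction({
    (1, 1): lambda bids: "A_1",
    (1, 0): lambda bids: "A_2",
    (0, 1): lambda bids: "A_3",
    (0, 0): lambda bids: "A_4",
})

print(a_hat([[0.3, 0.7], [0.5, 0.1]]))   # both bidders present
print(a_hat([[0.3, 0.0], [0.0, 0.0]]))   # only bidder 1 present
```

One consequence of this encoding is that a present bidder whose bids are all exactly zero is indistinguishable from an absent bidder.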

5. ESTIMATING Â *

We expect the size of agg( Â * ) to grow quickly with respect to the number of bidders. In particular, if the bidders' participation probabilities are independent from each other, the size of agg( Â * ) will grow exponentially. Considering the known complexity of finding a single revenue-maximizing multi-item auction, we can infer that the task of obtaining Â * is daunting. This complexity is somewhat relieved by recent works in differentiable economics, as if we can settle with an approximately optimal auction for each participation profile, each A * i ∈ agg( Â * ) can be approximated with machine learning. However, we would still need to perform auction training |G| number of times, which can be exponential. This section presents an approach to circumvent this complexity, and its performance is experimentally evaluated in the subsequent section.

5.1. TWEAKED DATASET

Our insight for estimating $\hat{A}^*$ is to generate a dataset according to $G$ (participation profiles) and $Q$ (probabilities of profiles) under the stationary participation model; we call this dataset the "tweaked dataset". We then train an auction that performs well on the tweaked dataset in expectation. Put simply, we estimate every element of $\mathrm{agg}(\hat{A}^*)$ at the same time. The tweaked dataset contains $K$ frames, where each frame is a matrix of size $n \times m$ that contains the valuations of each bidder for each item. To obtain each frame, we first sample a participation profile $g \in G \subseteq \{0, 1\}^n$ according to $Q$, then sample the untweaked valuations $v \in \mathbb{R}^{n \times m}$, and then element-wise multiply the two after broadcasting $g$ across the item dimension.
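The sampling procedure for the tweaked dataset can be sketched with the standard library alone. The uniform valuation distribution on [0, 1], the seed, and the function name are assumptions made for this illustration; the profile probabilities are those of Scenario 1 in Section 6.

```python
import random


def sample_tweaked_dataset(profiles, probs, n, m, K, seed=0):
    """Return K frames, each an n x m valuation matrix masked by a sampled profile."""
    rng = random.Random(seed)
    frames = []
    for _ in range(K):
        g = rng.choices(profiles, weights=probs, k=1)[0]            # sample g according to Q
        v = [[rng.random() for _ in range(m)] for _ in range(n)]    # untweaked valuations, U[0, 1]
        # Broadcast g across the item dimension and multiply element-wise.
        frames.append([[g[i] * v[i][j] for j in range(m)] for i in range(n)])
    return frames


profiles = [(1, 1), (1, 0), (0, 1), (0, 0)]
probs = [0.16, 0.04, 0.64, 0.16]
data = sample_tweaked_dataset(profiles, probs, n=2, m=2, K=20000)

# Bidder 1 participates with probability 0.2, so about 80% of frames zero out their row.
absent_1 = sum(1 for frame in data if frame[0] == [0.0, 0.0]) / len(data)
print(len(data), round(absent_1, 2))
```

A single network trained on these frames sees every participation profile in proportion to its probability, which is the sense in which all elements of the aggregated auction are estimated at once.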

5.2. ARCHITECTURE

We adopt the additive neural network architecture from Rahme et al. (2020), which consists of a multi-layer perceptron (MLP) allocation network and an MLP payment network. Similar mechanism neural network architectures are used in Dütting et al. (2019); Duan et al. (2022); Ivanov et al. (2022). Since the auctioneer can choose not to allocate an item, and the optimal auction can take the form of a lottery, the allocation network is implemented as two networks. The first one ($f_1 : \mathbb{R}^{n \times m} \to [0, 1]^m$) computes the probability that the auctioneer allocates each item; the second one ($f_2 : \mathbb{R}^{n \times m} \to [0, 1]^{n \times m}$) computes the probability that an item is allocated to each bidder given that the auctioneer allocates that item. In Rahme et al. (2020) and our implementation, $f_1(\theta) = \sigma(\mathrm{MLP}(\theta))$ and $f_2(\theta) = \mathrm{softmax}(\mathrm{MLP}(\theta))$ (allowing for ghost bidders/items representing "no allocation"). The final allocation is then obtained as $a_{i,j} = [f_1(\theta)]_j \cdot [f_2(\theta)]_{i,j}$. The payment function is computed as $p = \sigma(\mathrm{MLP}(\theta))$, the ratio of each bidder's bid that they shall pay.
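The way the two allocation heads combine can be sketched without any learning framework. In this illustration the MLP outputs are replaced by hand-picked logits, the softmax is taken over bidders for each item, and ghost bidders are omitted; all of these are simplifying assumptions, and the function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    mx = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - mx) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def allocation(item_logits, bidder_logits):
    """Combine per-item logits (length m) and per-bidder logits (n x m) into an n x m allocation."""
    n, m = len(bidder_logits), len(item_logits)
    f1 = [sigmoid(z) for z in item_logits]                               # P(item j is allocated)
    f2_cols = [softmax([bidder_logits[i][j] for i in range(n)]) for j in range(m)]
    return [[f1[j] * f2_cols[j][i] for j in range(m)] for i in range(n)]


alloc = allocation([0.0, 2.0], [[1.0, 0.0], [1.0, 0.0]])
column_sums = [sum(alloc[i][j] for i in range(2)) for j in range(2)]
print(alloc)
print(column_sums)
```

Because each softmax column sums to 1 and each sigmoid output is at most 1, every item is allocated with total probability at most 1, so the output is feasible by construction.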

5.3. REGRET ESTIMATION

To compute the bidders' regret, we follow the approach proposed by Dütting et al. (2019) and use a misreport optimization loop to estimate the optimal "untruthful bid" of each bidder assuming the other bidders are truthful. The misreport function is an MLP whose width and depth are specified in Section 6. The output of the misreport network is an $n \times m$ matrix of ratios between 0 and 1, which, when element-wise multiplied by the valuations, returns the matrix of misreports. This allows the individual rationality constraint to be built into the network. When testing the misreport module, we noticed that there does not seem to be a general optimal misreport network: one misreport network can perform excellently for one auction but horribly for another. Therefore (similar to the choice of Rahme et al. (2021b)), we reinitialize the weights of the misreport network at each iteration of the auction training loop. We also found that the efficiency of the regret estimation step is greatly improved if we allow early stopping of the misreport optimization, which means stopping the misreport optimization loop once the regret stops increasing for a certain number of rounds. Hyperparameters for early stopping are specified in Section 6.

5.4. TRAINING

The training loop consists of three steps: 1) computing the auctioneer's revenue and the bidders' regret, 2) computing the loss and gradient, and 3) backpropagation. A desirable property of the loss function is that it provides a comparison between two auctions with different revenue and regret. This is accomplished using Proposition 1 from Rahme et al. (2020), attributed to Balcan et al. (2005) and Nisan.

Lemma 5.1 (Rahme et al. (2020)). Let rev and rgt denote the expected revenue and regret of some auction. Then there must exist some other auction that achieves zero BNIC regret and revenue of $(\sqrt{\mathrm{rev}} - \sqrt{\mathrm{rgt}})^2$ under the same setting.

The lemma above can be applied to DSIC auctions in 1-bidder, $m$-item auctions. Whether it also holds for general $n$-bidder, $m$-item DSIC auctions is still an open problem Rahme et al. (2020). Regardless, for any auction with non-zero regret, it provides a lower bound on the revenue of its zero-regret counterpart under the BNIC solution concept. Rahme et al. (2020) also argue convincingly that it is a reasonable single metric for comparing auctions, with which we can estimate the competency of any learned auction with non-zero regret. We adopt the quantity from the lemma above as the loss function, stated below:

$$\mathrm{loss} = -\left(\sqrt{\mathrm{rev}'} - \sqrt{\mathrm{rgt}'}\right) + \mathrm{rgt}'.$$

(The $+\,\mathrm{rgt}'$ term makes the model slightly inclined towards auctions with low regret.)
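As a worked illustration of how Lemma 5.1 turns a (revenue, regret) pair into a single comparable number, the sketch below evaluates the zero-regret revenue bound for two auctions. The revenue and regret figures are hypothetical, and the function name is not from the source.

```python
import math


def zero_regret_revenue_bound(rev, rgt):
    """Revenue (sqrt(rev) - sqrt(rgt))^2 guaranteed by Lemma 5.1; meaningful when rgt <= rev."""
    return (math.sqrt(rev) - math.sqrt(rgt)) ** 2


# Hypothetical learned auctions: A earns more revenue but has higher regret than B.
bound_a = zero_regret_revenue_bound(rev=0.90, rgt=0.01)
bound_b = zero_regret_revenue_bound(rev=0.86, rgt=0.0001)
print(round(bound_a, 4), round(bound_b, 4))
```

In this example the auction with lower raw revenue has the higher bound, which shows why revenue alone is not a sufficient basis for comparing learned auctions with non-zero regret.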

6. EXPERIMENTS

We perform experiments to test the capability of our proposed training procedure to recover $\hat{A}^*$. We first choose the case $n = 2$, $m = 2$ with bidder valuations uniformly distributed on $[0, 1]$, because it is a classic test case for automated mechanism design Sandholm & Likhodedov (2015); Likhodedov & Sandholm (2005) and allows an in-depth evaluation of the results. In particular, we pick three scenarios to test on:

1. bidder 1 participates with probability 0.2, bidder 2 participates with probability 0.8,
2. bidder 1 participates with probability 0.5, bidder 2 participates with probability 0.7,
3. bidder 1 participates with probability 0.7, bidder 2 participates with probability 0.9.

Since the participation probabilities of the bidders are all independent, the participation profiles for each scenario are the same, namely $g_1 = \langle 1, 1 \rangle$, $g_2 = \langle 1, 0 \rangle$, $g_3 = \langle 0, 1 \rangle$, $g_4 = \langle 0, 0 \rangle$. However, the probabilities of each profile differ across scenarios and are specified below.

• Scenario 1: $q_1 = 0.16$, $q_2 = 0.04$, $q_3 = 0.64$, $q_4 = 0.16$
• Scenario 2: $q_1 = 0.35$, $q_2 = 0.15$, $q_3 = 0.35$, $q_4 = 0.15$
• Scenario 3: $q_1 = 0.63$, $q_2 = 0.07$, $q_3 = 0.27$, $q_4 = 0.03$

All of the scenarios above are trained on their corresponding tweaked dataset with $K = 20{,}000$, and the depth and width of the allocation MLP, payment MLP, and misreport MLP are all set to 7 and 100, respectively. The learning rate of the misreport optimizer is set to $1 \times 10^{-5}$, and it is run for 300 iterations with the early stopping point set to 100. The auction function optimization step is run for 300 iterations, with a learning rate of $5 \times 10^{-4}$ for the first 100 iterations, $5 \times 10^{-5}$ for the second 100 iterations, and $5 \times 10^{-6}$ for the last 100 iterations. After training, we evaluate the learned auctions on a separate test set for both the aggregated auction and the individual auction performance. We first report the results on the three scenarios above when trained on tweaked datasets.
Note that each scenario is trained with a designated neural network and tested on that network. For comparison, we also train an auction function on an untweaked dataset and test it on the test sets of the three scenarios above. Tables 1 and 2 show that the networks trained on the tweaked dataset perform markedly better than the benchmark. In theory, $\hat{A}^*$, the optimal auction under the stationary participation model, is simply a "weighted sum" of its individual auctions. So we should expect the learned auctions from each of the three scenarios to perform decently in each individual participation profile. We evaluated this by additionally testing the three neural networks trained on the tweaked datasets on the 2 × 2 and 1 × 2 scenarios. We see that the learned auctions perform relatively well on the individual auctions, achieving a revenue close to the optimal in most cases while maintaining low regret. We also see a trend that the performance on an individual auction depends on the probability of the participation profile associated with that auction. For example, in scenario 1, the participation profile $g_2 = \langle 1, 0 \rangle$ happens with a low probability of 0.04, so if the first neural network performs badly in $g_2$, it harms the aggregated auction performance less than if it performs badly in $g_3 = \langle 0, 1 \rangle$, which happens with probability 0.64. Therefore, although $g_2$ and $g_3$ are in theory the same auction, the first neural network performs better in $g_3$. This trend can also be found in the other two scenarios.

Table 3: Benchmark performance of learned aggregated auction on individual auctions. Columns: NN, followed by rev′, rev*, and rgt′ for each of $g_1 = \langle 1, 1 \rangle$, $g_2 = \langle 1, 0 \rangle$, and $g_3 = \langle 0, 1 \rangle$. The first row begins: NN 1, rev′$_1$ = 0.688, rev*$_1$ ≈ 0.… (remaining entries missing).

A larger scale experiment with 3 bidders and 10 items is included in Appendix A.5.
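The profile probabilities listed for the three scenarios follow directly from the independence assumption, and can be reproduced with a few lines. The function name is hypothetical; the participation probabilities are the ones stated for Scenarios 1 to 3.

```python
from itertools import product


def profile_probabilities(p):
    """Map each participation profile to its probability, given independent bidders."""
    out = {}
    for g in product([1, 0], repeat=len(p)):
        q = 1.0
        for present, p_i in zip(g, p):
            q *= p_i if present else 1.0 - p_i
        out[g] = q
    return out


scenarios = {
    "Scenario 1": (0.2, 0.8),
    "Scenario 2": (0.5, 0.7),
    "Scenario 3": (0.7, 0.9),
}
for name, p in scenarios.items():
    rounded = {g: round(q, 2) for g, q in profile_probabilities(p).items()}
    print(name, rounded)
```

With n independent bidders this enumeration has 2^n entries, which is the exponential growth in the number of profiles discussed in Section 5.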

7. CONCLUSION

In this paper, we demonstrate how to run credible, incentive compatible, privacy-preserving, and revenue-maximizing auctions in settings where auctions take place with high frequency. Our work is inspired by the impossibility theorem of Akbarpour & Li (2020), which establishes a trilemma between credibility, incentive compatibility, and privacy preservation in stage auctions assuming no communication between bidders. Because cryptographic protocols are efficient these days, we relax the assumption of no bidder communication and show that, by implementing a verification scheme in a repeated auction, we can obtain credibility while maintaining incentive compatibility and bidders' privacy. We also propose a stationary bidder participation model, which to our knowledge is the first in the differentiable economics community. We provide a method for training revenue-maximizing auctions in the stationary participation model, whose theory and efficacy are tested with two experiments. We note that our method for training revenue-maximizing auctions in the stationary participation model can be applied not only in our repeated auction protocol, but also in any stage auction where the participation of bidders is uncertain.

A APPENDIX

A.1 BENCHMARK ALGORITHM

The following algorithm provides a default way to implement auctions learned with differentiable economics techniques Dütting et al. (2019).

Note that the running times of our proposed auction protocol (Algorithm 1) and of the default protocol (Algorithm 2) are both linear in T, assuming that the auction, queries of the log, and the verification scheme ver each take constant time.

A.2 SAMPLE UNTRUTHFUL STRATEGY FOR AUCTIONEER

We now illustrate a strategy for the auctioneer when the penalty in Algorithm 1 is not set high enough. Recall that this means $penalty \le T \cdot aRgt^*$. Since the valuation distribution is fixed, $\theta^t$ is sampled from the set of all possible bids $\Theta \subseteq \mathbb{R}^{n \times m}$. Suppose there exists a set of misreports $\Theta' = \{\theta' \mid \theta' \in \Theta,\ aRgt(A', \theta') \ge penalty/T \text{ for some feasible } A',\ \mathrm{Prob}(\theta') > 0\}$. Then, if the auctioneer deviates whenever $\theta^t \in \Theta'$ and is truthful otherwise, they obtain revenue equal to or higher than under the strategy of always being truthful.

A.3 REGRET AND REVENUE OF INDIVIDUAL AUCTIONS

Corollary A.0.1 (Corollary of Lemma 5.1 by Rahme et al. (2020)). Let $rev^*$ be the optimal revenue for an additive auction with n bidders and m items. Then, under the BNIC solution concept, for any other auction $A'$ that achieves expected revenue $rev'$ and expected mean regret $rgt'$, the following inequality must hold: $rev^* \ge (\sqrt{rev'} - \sqrt{rgt'})^2$.

The above corollary allows us to find an upper bound for the revenue of an auction given its regret and the optimal revenue, which is useful in estimating the revenue of an individual auction within an aggregated auction. We discuss this below.

In Sections 5.3 and 5.4, we transformed estimating $\hat{A}^*$ into learning a revenue-maximizing auction in a setting where bidders have a "tweaked" valuation distribution. By definition of the aggregated auction $\hat{A}$, if we obtain $\hat{A}'$ as a decent estimate of $\hat{A}^*$, we should expect at least some $A_i \in agg(\hat{A}')$ to be near revenue-maximizing as well. For example, if n = 3, m = 2, all bidders have valuations uniformly distributed on [0, 1], $G = \{\langle 1, 1, 1 \rangle, \langle 1, 0, 0 \rangle\}$, and we learn a competent auction $\hat{A}'$ for this stationary participation model, then we should expect $\hat{A}'$ to perform well in the fixed-bidder 1-bidder, 2-item auction as well. Thankfully, this is something we can check, because analytical results are known for special cases of combinatorial auctions, including the 1-bidder, 2-item case Manelli & Vincent (2006).
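The upper bound that Corollary A.0.1 yields can be obtained by a short rearrangement. The worked step below is a sketch that assumes the corollary is read as $rev^* \ge (\sqrt{rev'} - \sqrt{rgt'})^2$; the numerical values ($rev^* = 0.55$, $rgt' = 10^{-4}$) are illustrative assumptions, not reported results.

```latex
\begin{align*}
rev^* \ge \left(\sqrt{rev'} - \sqrt{rgt'}\right)^2
  &\;\Longrightarrow\; \sqrt{rev'} - \sqrt{rgt'} \le \sqrt{rev^*} \\
  &\;\Longrightarrow\; rev' \le \left(\sqrt{rev^*} + \sqrt{rgt'}\right)^2 .
\end{align*}
% Illustrative values (assumed): rev^* = 0.55, rgt' = 10^{-4}
\[
rev' \le \left(\sqrt{0.55} + \sqrt{10^{-4}}\right)^2
     \approx (0.7416 + 0.01)^2 \approx 0.565 .
\]
```

With a regret this small, the bound sits only slightly above the optimal revenue, so a learned auction reporting much more than $rev^*$ would have to carry correspondingly larger regret.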



A sponsored search auction is one where the website owner auctions different ad spots on the webpage when a certain keyword is searched.

In a second-price auction, the auctioneer allocates the item to the highest bidder and charges them the bid of the second-highest bidder. The best strategy for any bidder in the second-price auction is to bid exactly how much they value the item.

This holds for hash functions like SHA3.



, widely used by related papers in differentiable economics Duetting et al. (2019); Duan et al. (2022); Ivanov et al. (2022); Rahme et al. (2021a;b); Curry et al. (

87 8 × 10 -4 0.031 0.55 3 × 10 -5 0.563 0.55 7 × 10 -55 2 × 10 -4 0.535 0.55 2 × 10 -3

Definition 4.2. A stage auction $A$ is Bayesian Nash Incentive Compatible (BNIC) if there is a Bayesian Nash equilibrium in which the bidders report their true valuations. A repeated auction $\mathcal{A}$ is BNIC if, given that every other bidder chooses to be truthful in every round, bidder i achieves their optimal utility by bidding truthfully in every round.

Definition 4.3. A stage auction $A$ is Dominant Strategy Incentive Compatible (DSIC) if being truthful weakly dominates every other strategy regardless of the other bidders' strategies. A repeated auction $\mathcal{A}$ is DSIC if, no matter what every other bidder does, bidder i achieves their optimal utility by bidding truthfully in every round.
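As an illustration of the stage-auction DSIC condition, the following sketch checks by brute force that truthful bidding is weakly dominant in a single-item second-price auction. The discrete bid grid, the tie-breaking rule (lowest index wins), and all function names are assumptions made for this example; it is not the learned auction studied in the paper.

```python
from itertools import product


def second_price(bids):
    """Return (winner index, price) for a single-item second-price auction."""
    # max() returns the first maximal element, so ties go to the lowest index.
    winner = max(range(len(bids)), key=lambda i: bids[i])
    price = max(b for i, b in enumerate(bids) if i != winner)
    return winner, price


def utility(i, value, bids):
    """Quasi-linear utility of bidder i with true value `value`."""
    winner, price = second_price(bids)
    return value - price if winner == i else 0.0


def truthful_is_weakly_dominant(grid, n_bidders=2):
    """Check the DSIC condition on a finite grid of values and bids."""
    for i in range(n_bidders):
        for value in grid:
            for others in product(grid, repeat=n_bidders - 1):
                truthful = list(others[:i]) + [value] + list(others[i:])
                u_truth = utility(i, value, truthful)
                for deviation in grid:
                    bids = list(others[:i]) + [deviation] + list(others[i:])
                    if utility(i, value, bids) > u_truth + 1e-12:
                        return False  # a profitable misreport exists
    return True


grid = [k / 4 for k in range(5)]  # values and bids in {0, 0.25, ..., 1}
is_dsic = truthful_is_weakly_dominant(grid)
print(is_dsic)
```

The check quantifies over every opposing bid profile, which is what distinguishes DSIC from BNIC: the latter only requires truthfulness to be optimal when the other bidders are themselves truthful.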

Table 1: Performance of three learned auctions in their corresponding scenario after training on the tweaked dataset

Table 2: Performance of a neural network trained on untweaked dataset tested on the three scenarios

in a high-frequency auction market such as online ad auctions.


Table 3: Benchmark performance of learned aggregated auction on individual auctions ( * implies estimated value).


We now let $\hat{A}'$ denote the learned aggregated auction, and let $rev'_i$, $rgt'_i$ denote the expected revenue and regret of $A'_i \in agg(\hat{A}')$, which corresponds to the participation profile $g_i$ that happens with probability $q_i$. We let $rev^*_i$ denote the revenue of the optimal auction for participation profile $g_i$. Suppose the aggregated auction $\hat{A}'$ achieves revenue $rev'_{agg}$ and regret $rgt'_{agg}$; then we know the following about the performance of the individual auctions $A'_i$.

Lemma A.1. The regret of the individual auctions can be bounded in the following way: $rgt'_i \le rgt'_{agg} / q_i$.

Proof of Lemma A.1. The aggregated regret $rgt'_{agg}$ is a weighted sum of the regrets of its individual auctions, $rgt'_{agg} = \sum_{i=1}^{|G|} q_i \cdot rgt'_i$. Since every term of the sum is non-negative, $q_i \cdot rgt'_i \le rgt'_{agg}$, and dividing by $q_i > 0$ gives the claim.

Lemma A.2. The revenue of the individual auctions can be upper bounded in the following way: [...], where $rev^*$ is the optimal revenue.

Proof of Lemma A.2. From Corollary 5.1.1, we know [...]; we obtain the claim.

Thus, if we know the optimal revenue of the individual auctions in $\hat{A}$, and we know the revenue and regret of the learned auction $\hat{A}'$, we can find upper bounds on the revenue and regret of each individual learned auction. Since the aggregated revenue is a sum of the revenues of the individual auctions weighted by their corresponding probabilities, we can also obtain a lower bound for the revenue of the i-th auction by subtracting away the upper bound of every other auction.

Theorem A.3. The revenue of $A'_i \in agg(\hat{A}')$ is guaranteed in the following way: [...]

Proof of Theorem A.3. We can plug Lemma A.2 into [...]; expanding the right-hand side leaves [...]; plugging it in leaves [...].

Note that Theorem A.3 is quite a conservative bound, because Lemma A.1 is a conservative bound for $rgt'_i$, and Theorem A.3 applies it $|G| - 1$ times. Therefore, as $|G|$ increases, the bound in Theorem A.3 becomes loose quickly.
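The chain of bounds described above can be sketched numerically. This is one possible reading rather than the exact statements of Lemma A.2 and Theorem A.3: it assumes the corollary takes the form $rev^* \ge (\sqrt{rev'} - \sqrt{rgt'})^2$, that Lemma A.2 combines it with Lemma A.1, and that Theorem A.3 subtracts the weighted upper bounds of all other auctions from the aggregated revenue. All function names and input numbers are hypothetical.

```python
import math


def regret_upper_bound(rgt_agg, q_i):
    """Lemma A.1: rgt'_i <= rgt'_agg / q_i."""
    return rgt_agg / q_i


def revenue_upper_bound(rev_opt_i, rgt_agg, q_i):
    """Assumed reading of Lemma A.2:
    rev'_i <= (sqrt(rev*_i) + sqrt(rgt'_agg / q_i))^2.
    """
    return (math.sqrt(rev_opt_i) + math.sqrt(regret_upper_bound(rgt_agg, q_i))) ** 2


def revenue_lower_bound(i, rev_agg, rgt_agg, q, rev_opt):
    """Assumed reading of Theorem A.3: remove the weighted upper bound of
    every other auction from rev'_agg, then divide by q_i.
    """
    others = sum(
        q[j] * revenue_upper_bound(rev_opt[j], rgt_agg, q[j])
        for j in range(len(q))
        if j != i
    )
    return (rev_agg - others) / q[i]


# Hypothetical inputs: two equally likely profiles.
q = [0.5, 0.5]            # profile probabilities q_i
rev_opt = [0.87, 0.55]    # optimal revenues rev*_i (illustrative)
rev_agg, rgt_agg = 0.70, 1e-4

for i in range(len(q)):
    lower = revenue_lower_bound(i, rev_agg, rgt_agg, q, rev_opt)
    upper = revenue_upper_bound(rev_opt[i], rgt_agg, q[i])
    print(i, lower, upper)
```

Because each lower bound subtracts $|G| - 1$ upper bounds that each use the coarse regret bound $rgt'_{agg} / q_j$, the gap between the lower and upper bounds widens as the number of profiles grows or as some $q_j$ becomes small.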
We also provide an alternative lower bound for $rev'_i$ that is tighter than Theorem A.3; the tradeoff is that this tighter bound contains a maximization problem.

Proposition A.4. The revenue of $A'_i \in agg(\hat{A}')$ is guaranteed in the following way: [...]

The maximization problem inside this bound can make it seem complex; in fact, the objective function of the maximization problem is quite straightforward, so the bound can be computed efficiently as well.

A.4 PROOFS

A.4.1 PROOFS IN SECTION 4

Proof of Lemma 4.4. To see that $\mathcal{A}(A)$ being BNIC implies that $A$ is BNIC, and that $\mathcal{A}(A)$ being DSIC implies that $A$ is DSIC, observe that $A$ is a special case of $\mathcal{A}(A)$ where T = 1. For the other direction, suppose $A$ is BNIC and every bidder but i is truthful in each of the T rounds; then always being truthful weakly dominates every other possible strategy for bidder i. The same argument applies to DSIC.

Proof of Lemma 4.5. Let $u_{i,j}$ be the expected utility of bidder i from bidding truthfully in $A_j \in agg(\hat{A})$, regardless of the other bidders' strategies. By the assumption that every $A_j \in agg(\hat{A})$ is DSIC, $u_{i,j}$ must be the highest utility that bidder i can obtain in $A_j$. Now suppose $\mathcal{A}(\hat{A})$ reaches round t, which means the auctioneer will run some $A_j \in agg(\hat{A})$ that corresponds to $g^t$. Although the bidder may not know which $A_j$ is run, the maximum expected utility for bidder i in this round is $u_{i,j}$, which is achieved by bidding truthfully. The proof for BNIC follows a similar argument, where we relax the assumption and expect the other bidders to be truthful.

Proof of Theorem 4.6. The expected revenue per round of $\mathcal{A}(\hat{A})$ can be computed as $\sum_{i=1}^{|G|} q_i \cdot rev_i$, where $rev_i$ is the expected revenue of $A_i \in agg(\hat{A})$ and $q_i$ is the probability of its participation profile. It is then clear that the expected revenue per round of $\mathcal{A}(\hat{A})$ can be improved iff the revenue of any $A_i \in agg(\hat{A})$ can be improved.

A.5 ADDITIONAL EXPERIMENTS

We now turn to a slightly larger scale experiment: n = 3, m = 10, with bidders' valuations uniformly distributed on [0, 1]. We again pick three scenarios where the bidders' participation probabilities are independent, so the three scenarios share the same set of participation profiles: $g_1 = \langle 1, 1, 1 \rangle$, $g_2 = \langle 0, 1, 1 \rangle$, $g_3 = \langle 1, 0, 1 \rangle$, $g_4 = \langle 1, 1, 0 \rangle$, $g_5 = \langle 1, 0, 0 \rangle$, $g_6 = \langle 0, 1, 0 \rangle$, $g_7 = \langle 0, 0, 1 \rangle$, $g_8 = \langle 0, 0, 0 \rangle$. However, in each of the three scenarios, the participation profiles occur with different probabilities. We follow the same experimental procedure as for the 2 × 2 experiment, except that the allocation MLP, payment MLP, and misreport MLP are all expanded to a depth of 17 and a width of 120. We also loop the misreport module only 100 times, with no early stopping, on each regret estimation step. The auction function optimization step is looped 600 times, with a learning rate of $5 \times 10^{-4}$ for the first 200 iterations, $5 \times 10^{-5}$ for the next 200 iterations, and $5 \times 10^{-6}$ for the last 200 iterations.
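The piecewise-constant learning-rate schedule described above can be written as a small helper. This is a minimal sketch: the function name and the zero-based iteration index are assumptions, and it is independent of any particular training framework.

```python
def learning_rate(iteration: int) -> float:
    """Learning rate for the auction-function optimization step.

    `iteration` is zero-based and must lie in [0, 600).
    """
    if not 0 <= iteration < 600:
        raise ValueError("iteration must be in [0, 600)")
    if iteration < 200:
        return 5e-4  # first 200 iterations
    if iteration < 400:
        return 5e-5  # next 200 iterations
    return 5e-6      # last 200 iterations


# Collect the schedule for all 600 optimization iterations.
schedule = [learning_rate(k) for k in range(600)]
```

In a training loop, the value returned for the current iteration would be assigned to the optimizer before each update of the auction function.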

