NO-REGRET LEARNING IN REPEATED FIRST-PRICE AUCTIONS WITH BUDGET CONSTRAINTS

Anonymous

Abstract

Recently, the online advertising market has exhibited a gradual shift from second-price auctions to first-price auctions. Although there has been a line of work on online bidding strategies in first-price auctions, how to handle budget constraints in this problem remains open. In this paper, we initiate the study of how a buyer with a budget learns her online bidding strategy in repeated first-price auctions. We propose an RL-based bidding algorithm that competes against the optimal non-anticipating strategy under stationary competition. Our algorithm obtains $\widetilde{O}(\sqrt{T})$ regret if all bids are revealed at the end of each round, where $\widetilde{O}(\cdot)$ is a variant of big-O notation that hides logarithmic factors. Under the restriction that the buyer only sees the winning bid after each round, our modified algorithm obtains $\widetilde{O}(T^{7/12})$ regret using techniques developed from survival analysis. Our analysis extends to the more general scenario where the buyer can have any bounded instantaneous utility function, with regret of the same order. Simulation experiments show that the constant factor inside the regret bound is rather small.

1. INTRODUCTION

The online advertising market has grown extensively in recent years. It was estimated that the volume of online advertising worldwide would reach 500 billion dollars in 2022 (Statista, 2021). In such a market, advertising platforms use auctions to allocate ad opportunities. Typically, each advertiser has a limited amount of capital for an advertising campaign. Consecutive rounds of competition are therefore interconnected through the budgets of the participating advertisers. Furthermore, each advertiser has very limited knowledge of 1) her valuation of certain keywords and 2) the competitors she is facing. Many works have been devoted to algorithms that learn strategies for optimally spending the budget in repeated second-price auctions (see Section 1.1).

In practice, on the other hand, we have witnessed numerous switches from second-price auctions to first-price auctions in the online advertising market. A recent remarkable example is Google AdSense's integrated move at the end of 2021 (LLC, 2021). Earlier examples include AppNexus, Index Exchange, and OpenX (Sluis, 2017). This industry-wide shift is due to various factors, including a fairer transactional process and increased transparency. The shift to first-price auctions therefore lends major importance to the following open question, which has barely been considered in previous work: How should budget-constrained advertisers learn to compete in repeated first-price auctions?

This paper thus initiates the study of learning to bid with budget constraints in repeated first-price auctions. It has been noted that the application of first-price auctions with budgets is not limited to the online advertising mentioned above. Traditional competitive environments such as the mussel trade in the Netherlands (van Schaik et al., 2001), modern price competition, and procurement auctions (e.g., U.S. Treasury securities auctions (Chari and Weber, 1992)) are examples as well.

Challenges and contributions

The challenges in this setting are two-fold. The first challenge relates to the specific information structure of first-price auctions. In practice, it is often the case that only the highest bid is revealed to all participants (Esponda, 2008). This is known in the literature as censored feedback, or as an informational version of the winner's curse (Capen et al., 1971).
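To make the censored-feedback structure concrete, the following is a minimal illustrative sketch of a single round, not the algorithm proposed in this paper. The names `RoundFeedback` and `play_round`, the tie-breaking rule, and the capping of the bid at the remaining budget are all assumptions introduced purely for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RoundFeedback:
    """What the buyer observes after one round (hypothetical container)."""
    won: bool            # whether the buyer won the item
    payment: float       # first-price rule: the winner pays her own bid
    winning_bid: float   # the highest bid overall, the only bid revealed


def play_round(bid: float, budget: float, rival_bids: Sequence[float]) -> RoundFeedback:
    """Simulate one first-price round in which only the winning bid is revealed."""
    bid = min(bid, budget)            # assumption: bids are capped at the remaining budget
    highest_rival = max(rival_bids)
    won = bid > highest_rival         # assumption: ties are resolved against the buyer
    payment = bid if won else 0.0
    # The revealed winning bid is the maximum over all bids, including the buyer's own.
    return RoundFeedback(won=won, payment=payment, winning_bid=max(bid, highest_rival))
```

In this sketch, a losing buyer observes the highest competing bid exactly, whereas a winning buyer sees only her own bid and thus learns merely that the highest competing bid lies below it. This one-sided observation is what makes the feedback censored.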

