FEINT IN MULTI-PLAYER GAMES

Abstract

This paper introduces the first formalization, implementation, and quantitative evaluation of Feint in Multi-Player Games. Our work first formalizes Feint from the perspective of Multi-Player Games, in terms of temporal impacts, spatial impacts, and their collective effects. The formalization is built upon the Non-transitive Active Markov Game Model, where Feint can have a considerable impact. Our work then considers practical implementation details of Feint in Multi-Player Games, based on the state-of-the-art progress of multi-agent modeling to date (namely Multi-Agent Reinforcement Learning). Finally, our work quantitatively examines the effectiveness of our design, and the results show that our design of Feint can (1) greatly improve the reward gains from the game; (2) significantly improve the diversity of Multi-Player Games; and (3) incur only negligible overheads in terms of time consumption. We conclude that our design of Feint is effective and practical, making Multi-Player Games more interesting.

1. INTRODUCTION

Game simulations that rely only on the Markov Game Model (Filar (1976)) or its variants (Wampler et al. (2010); Kim et al. (2022)) create a need for diversity and randomness to improve the game experience. The trend of incorporating more detail into simulated games demands: ➊ non-transitivity (i.e. there is no dominant gaming strategy), which allows players to dynamically change game strategies; in this way, newly incorporated strategies maintain a high level of diversity, which guarantees a high degree of unexploitability (Liu et al. (2021)); and ➋ strict requirements on temporal impacts (and their implications for spatial and collective impacts), since modern game simulations are highly time-sensitive (Nota & Thomas (2020)). Therefore, new optimizations on these game models are expected to be elegant and easy to implement, so as to preserve the original spirit of these games.

Our work first builds upon representative examples from the above two trends, by unifying two pieces of state-of-the-art progress in Multi-Player Games: ➊ we use Unified Behavioral and Response Diversity (described in Liu et al. (2021)), which exploits non-transitivity (i.e. no single dominant strategy in many complex games) to highlight the importance of diversity in game policies; moreover, we address an issue in their work, which fails to consider the intensity and future impacts of complex interactions among agents; and ➋ we incorporate Long-Term Behavior Learning (described in Kim et al. (2022)), which proposes the Active Markov Game Model to emphasize the convoluted future impacts of complex interactions among agents. Based on these two results, we unify them into a new model called the Non-transitive Active Markov Game Model (NTAMGM), and use it throughout this work. This unification satisfies the need for a game model in which (A) agents have intense and time-critical interactions; and (B) the design space of game policies is highly diverse.
The definition of NTAMGM is given below.
• Non-transitive Active Markov Game Model: We define a k-agent Non-transitive Active Markov Game Model as a tuple ⟨K, S, A, P, R, Θ, U⟩, where: K = {1, ..., k} is the set of k agents; S is the state space; A = {A_i}_{i=1}^{K} is the set of per-agent action spaces, in which no action is dominant; P performs state transitions from the current state given the agents' joint action, P : S × A_1 × A_2 × ... × A_K → P(S), where P(S) denotes the set of probability distributions over the state space S; R = {R_i}_{i=1}^{K} is the set of per-agent reward functions; Θ = {Θ_i}_{i=1}^{K} is the set of per-agent policy parameters; and U = {U_i}_{i=1}^{K} is the set of per-agent policy update functions.

