ELICITATION INFERENCE OPTIMIZATION FOR MULTI-PRINCIPAL-AGENT ALIGNMENT

Abstract

In multi-principal-agent alignment scenarios including governance, markets, conflict resolution, and AI decision-making, it is infeasible to elicit every principal's view on all perspectives relevant to an agent's decisions. Elicitation inference optimization (EIO) aims to minimize the number of elicitations n needed to approximate N principals' views across K perspectives. In this work, we demonstrate an EIO approach where data efficiency (NK/n) increases with scale. We introduce STUMP: an elicitation inference model which integrates a large language model with a latent factor model to enable learning transfer across samples, contexts, and languages. We characterize STUMP's performance on a set of elicitation primitives from which scalable elicitation (sampling) protocols can be constructed. Building on these results, we design and demonstrate two elicitation protocols for STUMP where, surprisingly, data efficiency scales like O(n) in the number of elicitations n. In other words, the number of elicitations needed per principal remains constant even as the number of perspectives and principals grows. This makes it possible to approximate complex, high-dimensional preference signals spanning principal populations at scale.

1. INTRODUCTION

The principal-agent problem involves aligning agent decisions with principal interests. Elicitation inference optimization (EIO) aims to decrease the amount of direct elicitation needed to recover a preference signal, enabling the use of more complex, higher-dimensional preferences (e.g., in natural language). Consider an N × K matrix Θ where rows correspond to N principals, columns correspond to K perspectives, and every element captures a principal-perspective relationship. The goal of EIO is to obtain a sufficient approximation of Θ with a minimal elicitation budget by directly sampling some elements and inferring the rest. Thus, EIO combines (a) a sparse elicitation (sampling) protocol with (b) an elicitation inference model. Closed-ended surveys simplify EIO by constraining the relevant perspectives to a predefined set, typically with K << N. Matrix sampling techniques elicit responses from each participant on a subset of perspectives selected randomly Shoemaker (1973), heuristically Raghunathan and Grizzle (1995), or dynamically such that inference accuracy is adaptively optimized
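
To make the sample-then-infer setup concrete, the following is a minimal sketch, not the STUMP model itself. It assumes a synthetic low-rank Θ, a random matrix sampling protocol in which each principal is asked about a fixed number of perspectives, and a plain latent factor model fit by alternating least squares. All names (`sample_mask`, `fit_latent_factors`, `per_principal`, and the sizes used) are hypothetical and chosen only for illustration.

```python
import numpy as np


def sample_mask(n_principals, n_perspectives, per_principal, rng):
    """Random matrix sampling: each principal is asked about
    `per_principal` perspectives chosen uniformly without replacement."""
    mask = np.zeros((n_principals, n_perspectives), dtype=bool)
    for i in range(n_principals):
        cols = rng.choice(n_perspectives, size=per_principal, replace=False)
        mask[i, cols] = True
    return mask


def fit_latent_factors(theta_obs, mask, rank, reg=0.1, n_iters=30, seed=0):
    """Fit theta ~ U @ V.T on the observed entries only, using
    regularized alternating least squares (an assumed stand-in model)."""
    rng = np.random.default_rng(seed)
    n, k = theta_obs.shape
    U = rng.normal(scale=0.1, size=(n, rank))
    V = rng.normal(scale=0.1, size=(k, rank))
    ridge = reg * np.eye(rank)
    for _ in range(n_iters):
        # Update each principal's factors from the perspectives they answered.
        for i in range(n):
            Vo = V[mask[i]]
            U[i] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ theta_obs[i, mask[i]])
        # Update each perspective's factors from the principals who answered it.
        for j in range(k):
            Uo = U[mask[:, j]]
            V[j] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ theta_obs[mask[:, j], j])
    return U @ V.T


# Synthetic example (assumed sizes): N principals, K perspectives, rank-3 theta.
rng = np.random.default_rng(0)
N, K, rank, per_principal = 200, 50, 3, 15
theta = rng.normal(size=(N, rank)) @ rng.normal(size=(rank, K))

mask = sample_mask(N, K, per_principal, rng)            # elicitation protocol
theta_hat = fit_latent_factors(np.where(mask, theta, 0.0), mask, rank)  # inference

n = int(mask.sum())                                     # elicitation budget used
efficiency = N * K / n                                  # data efficiency NK/n
rmse_unobs = np.sqrt(np.mean((theta_hat - theta)[~mask] ** 2))
baseline = np.sqrt(np.mean(theta[~mask] ** 2))          # error of predicting 0

print(f"n={n}, efficiency={efficiency:.2f}")
print(f"RMSE on inferred entries={rmse_unobs:.3f}, zero baseline={baseline:.3f}")
```

In this sketch the elicitation budget is n = N × `per_principal`, so the data efficiency NK/n reduces to K divided by the number of questions per principal. The error is measured only on entries that were never elicited, which is the part of Θ the inference model has to supply.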



A challenge is creating situations where agent choices are sufficiently influenced by signals containing principal preferences. With a single principal, high-complexity preference signals can be elicited directly via open-ended interaction. Multi-principal-agent scenarios can involve large populations of principals and powerful agents, such as governments & citizens Giger and Lefkofridi (2014); Gabriel (2020), firms & customers Roberts and Grover (2012), peacekeepers & conflict parties United Nations (2012), existing AI systems & impacted populations Prabhakaran et al. (2022), and potentially even transformative AI & humanity Russell et al. (2015); Christiano et al. (2017). As the number of principals grows and the domain of agent decisions becomes open-ended, directly eliciting the preferences of all principals on all relevant perspectives becomes infeasible [A.1]. As a result, lower-complexity forms of elicitation such as ballot voting (i.e., for governments) and price signals (i.e., for firms) are used to learn preferences. While clearly effective (they are a basis of democracy and the economy), these approaches drastically simplify real preferences. For example, they do not allow citizens (principals) to express what they would like a government (agent) to do, or why; they only capture whether citizens support predefined options.

