THE GAME OF HIDDEN RULES: A NEW CHALLENGE FOR MACHINE LEARNING

Anonymous authors
Paper under double-blind review

Abstract

Systematic examination of learning tasks remains an important but understudied area of machine learning (ML) research. To date, most ML research has focused on measuring performance on new tasks or surpassing state-of-the-art performance on existing tasks. These efforts are vital but do not explain why some tasks are more difficult than others. Understanding how task characteristics affect difficulty is critical to formalizing ML's strengths and limitations; a rigorous assessment of which types of tasks are well-suited to a specific algorithm and, conversely, which algorithms are well-suited to a specific task would mark an important step forward for the field. To assist researchers in this effort, we introduce a novel learning environment designed to study how task characteristics affect measured difficulty for the learner. This tool frames learning tasks as a "board-clearing game," which we call the Game of Hidden Rules (GOHR). In each instance of the game, the researcher encodes a specific rule, unknown to the learner, that determines which moves are allowed at each state of the game. The learner must infer the rule through play. We detail the game's expressive rule syntax and show how it gives researchers granular control over learning tasks. We present example rules, an example ML algorithm, and methods to assess algorithm performance. Separately, we provide additional benchmark rules, a public leaderboard for performance on these rules, and documentation for installing and using the GOHR environment.

1. INTRODUCTION

Learning computational representations of rules has been one of the main objectives of the field of machine learning (ML) since its inception. In contrast to pattern recognition and classification (the other main domains of ML), rule learning is concerned with identifying a policy or computational representation of the hidden process by which data has been generated. These sorts of learning tasks have been common in applications of ML to real-world settings such as biological research (Khatib et al., 2011), imitation learning (Hussein et al., 2018), and game play (Mnih et al., 2015; Silver et al., 2018). Since this process involves sequential experimentation with the system, much of the recent work exploring rule learning has focused on using reinforcement learning (RL) to learn rules as optimal policies of Markov decision processes.

An important question is whether some characteristics make particular rules easier or harder to learn by a specific algorithm (or in general). To date, this has been a difficult question to answer, since many rules of interest in the real world are multifaceted and not well characterized. For instance, while there are effective RL algorithms that can play backgammon, chess, and Go, these games differ in significant ways, and it is not clear how much each structural variation contributes to differences in overall difficulty for the learner. To investigate these questions, new ways of generating rules and data must be devised that allow researchers to examine these characteristics in a controlled environment.

In this paper, we propose a new data environment called the Game of Hidden Rules (GOHR), which aims to help researchers in this endeavor. The main component of the environment is a game played on a 6 × 6 board with game pieces of different shapes and colors.
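To make the setup concrete, the episode structure just described can be sketched as a small environment: a 6 × 6 board holding pieces with a shape and a color, buckets at the four corners, and a hidden rule implemented as a predicate that accepts or rejects each attempted move. This is a minimal, hypothetical illustration only; the class name `GOHREnv`, the specific shape and color vocabularies, and the `example_rule` predicate are our own assumptions, not the paper's actual interface.

```python
import random


class GOHREnv:
    """Hypothetical sketch of a GOHR-style episode (not the official API).

    A 6x6 board holds pieces, each a (shape, color) pair; four buckets
    sit at the corners; a hidden rule is a predicate deciding whether a
    (piece, cell, bucket) move is accepted. The learner only observes
    accept/reject feedback via move().
    """

    SHAPES = ("circle", "square", "star", "triangle")   # assumed vocabulary
    COLORS = ("red", "blue", "yellow", "black")         # assumed vocabulary
    BUCKETS = ((0, 0), (0, 5), (5, 0), (5, 5))          # corner positions

    def __init__(self, rule, n_pieces=6, seed=0):
        rng = random.Random(seed)
        cells = [(r, c) for r in range(6) for c in range(6)]
        self.rule = rule  # known to the researcher, hidden from the learner
        self.board = {
            cell: (rng.choice(self.SHAPES), rng.choice(self.COLORS))
            for cell in rng.sample(cells, n_pieces)
        }

    def move(self, cell, bucket):
        """Attempt to move the piece at `cell` into `bucket`.

        Removes the piece and returns True if the hidden rule accepts
        the move; otherwise the piece stays and False is returned.
        """
        piece = self.board.get(cell)
        if piece is None or bucket not in self.BUCKETS:
            return False
        if self.rule(piece, cell, bucket):
            del self.board[cell]
            return True
        return False

    def cleared(self):
        """The round ends when every piece has been moved to a bucket."""
        return not self.board


# An illustrative hidden rule (our invention): red pieces go to the
# top-left bucket, all other pieces go to the bottom-right bucket.
def example_rule(piece, cell, bucket):
    _, color = piece
    return bucket == (0, 0) if color == "red" else bucket == (5, 5)
```

A naive learner could then clear the board by trying each bucket for each piece and recording which moves the hidden rule accepts; rule inference amounts to generalizing from that accept/reject history.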
The task of the learner is to clear the board in each round by moving the game pieces to "buckets" at the corners of the board according to a hidden rule, known to the researcher but not to the learner. Our environment allows researchers to express a hidden rule using a rich syntax that can map to many current tasks of interest

