Department of Computer Science and Technology

Action selection methods using reinforcement learning

Mark Humphrys

June 1997, 195 pages

This technical report is based on a dissertation submitted by the author for the degree of Doctor of Philosophy to the University of Cambridge, Trinity Hall.

DOI: 10.48456/tr-426

Abstract

The Action Selection problem is the problem of run-time choice between conflicting and heterogeneous goals, a central problem in the simulation of whole creatures (as opposed to the solution of isolated uninterrupted tasks). This thesis argues that Reinforcement Learning has been overlooked in the solution of the Action Selection problem. Considering a decentralised model of mind, with internal tension and competition between selfish behaviors, this thesis introduces an algorithm called “W-learning”, whereby different parts of the mind modify their behavior based on whether or not they are succeeding in getting the body to execute their actions. This thesis sets W-learning in context among the different ways of exploiting Reinforcement Learning numbers for the purposes of Action Selection. W-learning is a ‘Minimize the Worst Unhappiness’ strategy. The different methods are tested and their strengths and weaknesses analysed in an artificial world.
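
As a rough illustration of the phrase ‘Minimize the Worst Unhappiness’, the sketch below shows one plausible reading of it, not the W-learning algorithm as defined in the report. It assumes that each part of the mind (an "agent") holds a table of learned action values for the current state, and that an agent's unhappiness with an executed action is the gap between its own best value and the value it gets from that action. The agent names, action names, numbers, and function names are all hypothetical.

```python
# Hypothetical sketch: agent names, actions, and values are illustrative
# assumptions, not taken from the report.

def unhappiness(q_values, action):
    """Loss an agent suffers if `action` is executed instead of its own best action."""
    return max(q_values.values()) - q_values[action]


def minimize_worst_unhappiness(agents_q):
    """Return the action for which the worst-off agent loses the least."""
    # Assumes every agent values the same set of actions.
    actions = list(next(iter(agents_q.values())).keys())
    return min(
        actions,
        key=lambda a: max(unhappiness(q, a) for q in agents_q.values()),
    )


# Two competing agents with conflicting preferences in the same state.
agents_q = {
    "feed": {"left": 10, "stay": 4, "right": 0},
    "flee": {"left": 0, "stay": 3, "right": 8},
}

choice = minimize_worst_unhappiness(agents_q)
print(choice)
```

With these made-up values, executing either agent's favourite action leaves the other agent with a large loss, so the compromise action is selected instead. In the report itself the quantities involved are learned at run time rather than read from a fixed table.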

Full text

PS (0.5 MB)

BibTeX record

@TechReport{UCAM-CL-TR-426,
  author =	 {Humphrys, Mark},
  title = 	 {{Action selection methods using reinforcement learning}},
  year = 	 1997,
  month = 	 jun,
  url = 	 {https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-426.ps.gz},
  institution =  {University of Cambridge, Computer Laboratory},
  doi = 	 {10.48456/tr-426},
  number = 	 {UCAM-CL-TR-426}
}