Department of Computer Science and Technology

Technical reports

Learning in large state spaces with an application to biped robot walking

Thomas Ulrich Vogel

December 1991, 204 pages

This technical report is based on a dissertation submitted November 1991 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Wolfson College.

DOI: 10.48456/tr-241


Autonomous robots must be able to operate in complex, obstacle-cluttered environments. To do this, the robots must be able to focus on the important aspects of their environment, create basic strategies to carry out their operations, generalise these strategies and finally learn from successful experiences.

Based on simulated dynamic biped robot walking, this thesis investigates these issues. An algorithm is given which analyses the state space of the robot and orders the dimensions of the state space by their importance relative to the task of the robot. Using this analysis of its state space, the robot is able to generate a set of macros (gaits) which enable it to operate in its immediate environment. We then present a control algorithm which allows the robot to control the execution of its gaits.

Once the robot has learned to walk on an obstacle-free horizontal surface, it uses its knowledge about gaits to derive obstacle crossing gaits from existing gaits. A strategy based on the qualitative equivalence between two behaviours is introduced in order to derive new behavioural patterns from previous ones. This enables the robot to reason about its actions at a higher level of abstraction, which facilitates the transfer and adaptation of existing knowledge to new situations. As a result, the robot is able to derive stepping over an obstacle from stepping on a horizontal surface.

Finally, the robot analyses its successful obstacle crossings in order to generate a generic obstacle crossing strategy. The concept of a virtual evaluation function is introduced to describe how the robot has to change its search strategy in order to search successfully for obstacle crossing behaviours. This is done by examining how the robot’s successful obstacle crossings differ from its normal behaviour. By analysing and operationalising these differences, the robot acquires the capability to overcome previously unencountered obstacles. The robot’s obstacle crossing capabilities are demonstrated by letting the robot walk across randomly generated obstacle combinations.

Full text

PDF (13.9 MB)

BibTeX record

@TechReport{UCAM-CL-TR-241,
  author =	 {Vogel, Thomas Ulrich},
  title = 	 {{Learning in large state spaces with an application to
         	   biped robot walking}},
  year = 	 1991,
  month = 	 dec,
  url = 	 {},
  institution =  {University of Cambridge, Computer Laboratory},
  doi = 	 {10.48456/tr-241},
  number = 	 {UCAM-CL-TR-241}
}