Representation for Robot Vision

The greatest variety of shape representation techniques appears to be found in the vision research literature. This is largely because the task is simpler than the complete problem of robot control, so systems can successfully employ a wider range of techniques, free of some of the constraints imposed by the need to control a physical robot. Greater variety also results from attempts to describe natural objects in outdoor scenes, which can be far more difficult to capture with formal description techniques than the man-made objects that robots normally act upon.

Many visual representations are optimised for particular visual input facilities, or for recognising particular classes of objects. There is therefore much variation, even between systems that use the same overall approach. Rather than attempting a complete survey, the remainder of this section concentrates on those methods which are applicable to mechanical domains.

The earliest vision systems represented objects simply as a pixel map of the image corresponding to the object. A slightly more sophisticated representation provides a three dimensional description of an object using "silhouette" bitmaps observed along three axes. The result describes a three dimensional enclosure for the object (closed concavities are not captured), which can be stored efficiently. The technique is called rectangular parallelepiped coding [KA86].
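The idea of deriving an enclosure from three silhouettes can be sketched as follows. This is only an illustration of the principle, not the coding scheme of [KA86]: the voxel grid, the function names `silhouettes` and `enclosure`, and the hollow cube test object are all assumptions made for the example.

```python
def silhouettes(voxels):
    """Project a set of occupied (x, y, z) cells onto the three axis planes."""
    along_x = {(y, z) for x, y, z in voxels}  # bitmap seen looking along x
    along_y = {(x, z) for x, y, z in voxels}  # bitmap seen looking along y
    along_z = {(x, y) for x, y, z in voxels}  # bitmap seen looking along z
    return along_x, along_y, along_z


def enclosure(along_x, along_y, along_z, size):
    """Return every cell of a size^3 grid consistent with all three bitmaps."""
    return {
        (x, y, z)
        for x in range(size)
        for y in range(size)
        for z in range(size)
        if (y, z) in along_x and (x, z) in along_y and (x, y) in along_z
    }


# Hypothetical test object: a 3x3x3 cube with its centre cell removed,
# i.e. a closed concavity that no silhouette can reveal.
SIZE = 3
solid_cube = {(x, y, z) for x in range(SIZE) for y in range(SIZE) for z in range(SIZE)}
hollow_cube = solid_cube - {(1, 1, 1)}

views = silhouettes(hollow_cube)
recovered = enclosure(*views, SIZE)
```

Here `recovered` contains the hidden centre cell as well as the cells of the original object, which shows why the enclosure cannot describe closed concavities. The three bitmaps need on the order of 3n^2 bits for an n x n x n grid, rather than n^3 for a full voxel map.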

Applying edge detection filters to visual data enabled objects to be represented as a collection of boundaries. Guzman identified different types of vertex that can be formed at edge junctions in three dimensional polyhedral objects, with Huffman and Clowes later providing a theoretical foundation for the classification [RJ88]. This classification can be used to derive a three dimensional description of an object from the relationships between visible edges.
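As a rough illustration of vertex classification, the sketch below labels a junction from the image directions of the edges that meet at it, using the commonly used junction names L, fork, arrow and T. The angle test, the tolerance value and the function name `classify_junction` are assumptions made for this example; it is not a reconstruction of Guzman's procedure.

```python
def classify_junction(directions_deg, tolerance=2.0):
    """Label a junction from the directions (in degrees) of its edges.

    Two edges give an "L". For three edges, the angular gaps between
    neighbouring edges decide the label: a gap of about 180 degrees gives
    a "T", a gap larger than 180 gives an "arrow", otherwise a "fork".
    """
    angles = sorted(d % 360 for d in directions_deg)
    n = len(angles)
    # Angular gap from each edge to the next one, going round the junction.
    gaps = [(angles[(i + 1) % n] - angles[i]) % 360 for i in range(n)]

    if n == 2:
        return "L"
    if n == 3:
        if any(abs(g - 180) <= tolerance for g in gaps):
            return "T"
        if any(g > 180 for g in gaps):
            return "arrow"
        return "fork"
    return "other"  # junction types outside this small sketch


labels = [
    classify_junction([0, 90]),        # corner of a face
    classify_junction([0, 120, 240]),  # three faces seen at a convex corner
    classify_junction([0, 45, 315]),   # two faces seen at a corner
    classify_junction([0, 90, 180]),   # one edge occluding another
]
```

A labelling scheme in the Huffman and Clowes style would then use these junction labels to constrain which edges can be convex, concave or occluding.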

Lowe has developed a technique for identifying objects in terms of invariant groupings of edges which would have a known appearance when viewed from any angle [Low87]. Shape representation in terms of possible edge groupings identifies only features for shape recognition; it does not provide a full description of the shape. The technique is nevertheless interesting in that it provides a consistent mapping from two dimensional to three dimensional representations.

A useful technique for describing three dimensional shape is the method of generalised cylinders, developed by Binford. A generalised cylinder is created by sweeping a two dimensional shape along an axis, with the size of the swept cross-section varying according to a sweeping function. A plain cylinder, for example, is simply defined as a circle swept along a straight line, with a constant sweeping function. More complex definitions are easily achieved: a pyramid is a square swept along a straight axis with a linear sweeping function, decreasing toward the apex. Generalised cylinders were proposed as the output formalism for a large MIT vision project by Brady [Bra85a], following Marr's use of generalised cylinders as a high-level representation.
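The cylinder and pyramid examples can be sketched in a few lines. This is a minimal illustration restricted to a straight axis; the function names, the sampling into discrete slices and the parameter values are assumptions made for the example, and a fuller treatment would also allow curved axes.

```python
import math


def generalised_cylinder(cross_section, sweep, length, steps):
    """Sweep a 2D cross-section along a straight axis (taken here as z).

    cross_section: list of (x, y) points outlining the swept shape
    sweep: function mapping t in [0, 1] to a scale factor for the shape
    Returns a list of slices, each a list of (x, y, z) points.
    """
    slices = []
    for i in range(steps + 1):
        t = i / steps
        scale = sweep(t)
        z = t * length
        slices.append([(scale * x, scale * y, z) for x, y in cross_section])
    return slices


def circle(radius, n=16):
    """Approximate a circle by n points on its circumference."""
    return [
        (radius * math.cos(2 * math.pi * k / n), radius * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


square = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]

# Plain cylinder: a circle with a constant sweeping function.
cylinder = generalised_cylinder(circle(1.0), lambda t: 1.0, length=4.0, steps=4)

# Pyramid: a square with a linear sweeping function shrinking to the apex.
pyramid = generalised_cylinder(square, lambda t: 1.0 - t, length=2.0, steps=4)
```

Only the cross-section and the sweeping function differ between the two objects, which is what makes the representation compact for many mechanical parts.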

Description of shape by analysis of boundary features is provided by Brady's "Smoothed Local Symmetries" representation, which separates the boundary of a two dimensional image into sub-parts, according to transitions on the boundary that look like joins between parts. Shapes described with this representation are used as the basis for shape generalisation from visual data by Connell [CB87] [Con85], and for a project by Brady in which the function of mechanical tools is deduced from their shape [BA84a].

Many shape representation techniques used in vision systems can be applied to a more general range of shapes than those discussed above. Other methods for representing shape within a visual image include the use of surface patches of known curvature on three dimensional objects [FH86], the use of "Gaussian curvature" to represent bumps in a more general solid representation than the generalised cylinder [Bli87], and representation of a wide range of natural forms using three dimensional surfaces defined by superquadrics and fractals [Pen86a]. These techniques provide more generality, but are more complex than the methods above, which adequately describe mechanical objects.

Alan Blackwell
2000-11-17