Cantag/intro

Introduction to the tracking process

Cantag is one of a handful of fiducial tracking systems. A fiducial is simply a visual marker or tag, often used to uniquely label an object or space, similar to a barcode. Unlike a barcode, however, a fiducial in a tracking system conveys not only identity but also 3D position and orientation (relative to the viewing camera).

Inferring identity

Fiducial systems have traditionally followed one of two methods to establish the unique identity of a marker:

  • Symbolic codes: Here we put a unique bit pattern on the marker. Given a possible marker, the system can read the bits from the pattern and decide on the identity. This approach is attractive since it allows us to use established error-correcting codes and permits a more exact analysis of capabilities.
  • Pattern recognition: Here, instead of assigning a unique bit pattern, we assign a unique picture. When a potential marker is found in an image, the system uses the known, undistorted shape of the tag (see the next section) to undo the perspective transform of the picture. Establishing identity is then a matching process: we compare the current picture against every marker known to be deployed and use a variety of metrics to find the best match, which identifies the marker. The downside of this approach is that the matching process is typically slow and its performance is very hard to characterise analytically.
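The two approaches above can be contrasted with a minimal sketch. It is illustrative only and is not Cantag's API: the function names, the single even-parity bit (standing in for a real error-correcting code), and the Hamming-distance metric are all assumptions made for this example.

```python
from typing import Dict, Hashable, Optional, Sequence


def decode_identity(bits: Sequence[int]) -> Optional[int]:
    """Symbolic code: read the identity directly from the sampled bits.

    Hypothetical layout: the last bit is an even-parity bit over the payload.
    Returns None when the check fails, so the candidate can be rejected.
    """
    if len(bits) < 2 or any(b not in (0, 1) for b in bits):
        return None
    payload, parity = bits[:-1], bits[-1]
    if sum(payload) % 2 != parity:
        return None  # corrupted read, or not a marker at all
    identity = 0
    for b in payload:
        identity = (identity << 1) | b
    return identity


def match_pattern(sample: Sequence[int],
                  templates: Dict[Hashable, Sequence[int]]) -> Hashable:
    """Pattern recognition: compare the sample against every known template.

    Uses Hamming distance as a stand-in metric. Cost grows linearly with
    the number of deployed markers, which is why matching tends to be slow.
    """
    def distance(a: Sequence[int], b: Sequence[int]) -> int:
        return sum(x != y for x, y in zip(a, b))

    return min(templates, key=lambda tag_id: distance(sample, templates[tag_id]))
```

Note that the symbolic decoder can say "this is not a valid marker", whereas the matcher as written always returns its nearest template; a real matcher would also need a rejection threshold.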

The present distribution of Cantag supports symbolic coding on markers. However, it has been designed so that pattern matching (or other approaches to identity) is easy to add.

Inferring position and pose

We infer the position and pose of a marker from its known shape. For example, a square marker often features a solid square border around the identity bit pattern. When processing an image, we search it for shapes that could be a valid perspective projection of the known shape. For example, a square projects to a general quadrilateral.

Once we have identified a candidate marker, we undo the perspective projection using the known size and shape of the marker, expressed as points in the marker's own reference frame. For example, in the frame of a square marker the four corner points may be at (-1,1), (1,1), (1,-1), and (-1,-1), and we look for a transform that projects them to the corresponding points identified in the image. Once we have this transform, we implicitly know the position and pose of the marker in the reference frame of the camera. The marker can then be identified (or rejected as noise if it does not carry a valid identity).
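As a rough illustration of the square-marker case, the sketch below finds the planar projective transform (a homography) that maps the four marker-frame corners onto four image points. It is not Cantag's implementation: the function names are hypothetical, the bottom-right matrix entry is fixed to 1 as a normalisation assumption, and the image coordinates in the usage note are made up. Turning the homography into a 3D position and pose additionally requires the camera's calibration, which is not shown here.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix = List[List[float]]


def solve_linear(a: Matrix, b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src: Sequence[Point], dst: Sequence[Point]) -> Matrix:
    """3x3 transform taking four marker-frame points to four image points."""
    rows: Matrix = []
    rhs: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence, with h33 fixed to 1.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = solve_linear(rows, rhs) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]


def project(h: Matrix, p: Point) -> Point:
    """Apply the homography to a marker-frame point."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

For example, calling homography([(-1, 1), (1, 1), (1, -1), (-1, -1)], quad), where quad holds the four corners of a detected quadrilateral in the same order, yields a matrix for which project maps each marker corner back onto its image corner.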

What's special about Cantag?

There have been many different fiducial tracking systems over the years, used in a variety of different areas. However, they have tended to have one thing in common: they impose a strict tracking process. You must use the (single) marker design that the system supports, and the processing is opaque to the user. As a result, it is nearly impossible to compare two fiducial systems fairly. Questions such as "is pattern matching better than symbolic codes for accuracy" and "do circular markers give better tracking accuracy than squares" cannot be answered.

Cantag addresses this because it is a framework for constructing fiducial systems rather than a single system. It has the notion of a pipeline - a sequence of tasks that must be carried out to get the tracking result. Typical tasks are thresholding, contour following, shape matching, transform extraction, code reading, etc. Users are free to sequence the tasks as they choose, replacing any of them with more accurate or faster algorithms, which may be included in Cantag or custom-written.
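The pipeline idea can be sketched as a list of interchangeable stages, each consuming the output of the previous one. This is a conceptual illustration only and does not reflect Cantag's actual interfaces: every name below is hypothetical, images are plain nested lists of grey levels, and the final stage is a trivial placeholder for the later tasks.

```python
from functools import reduce
from typing import Any, Callable, List, Sequence

Stage = Callable[[Any], Any]
Image = List[List[int]]


def run_pipeline(stages: Sequence[Stage], frame: Any) -> Any:
    """Feed the frame through each stage in order."""
    return reduce(lambda data, stage: stage(data), stages, frame)


def fixed_threshold(level: int) -> Stage:
    """Fast thresholding: mark pixels darker than a constant level."""
    def stage(image: Image) -> Image:
        return [[1 if px < level else 0 for px in row] for row in image]
    return stage


def mean_threshold(image: Image) -> Image:
    """Alternative thresholding: use the image's mean grey level instead."""
    pixels = [px for row in image for px in row]
    level = sum(pixels) / len(pixels)
    return [[1 if px < level else 0 for px in row] for row in image]


def count_dark(binary: Image) -> int:
    """Placeholder for later stages (contour following, decoding, ...)."""
    return sum(map(sum, binary))
```

Swapping one algorithm for another then means changing a single list entry, for example run_pipeline([fixed_threshold(128), count_dark], image) versus run_pipeline([mean_threshold, count_dark], image), while the rest of the pipeline stays untouched.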

This approach provides a great deal of flexibility: system designers can construct their ideal system, tweak pipeline sections to better suit their application or platform, and experiment to find different tradeoffs.