# Introduction to the tracking process

Cantag is one of a handful of fiducial tracking systems. A fiducial is simply a visual marker or tag, often used to uniquely label an object or space, similar to a barcode. Unlike a barcode, however, a fiducial within a tracking system conveys not only identity but also 3D position and orientation (relative to the viewing camera).

## Inferring identity

Fiducial systems have traditionally followed one of two methods to establish the unique identity of a marker:

• Symbolic codes: Here we put a unique bit pattern on the marker. Given a possible marker, the system can read the bits from the pattern and decide on the identity. This approach is attractive since it allows us to use established error-correcting codes and permits a more exact analysis of capabilities.
• Pattern recognition: Here, instead of assigning a unique bit pattern, we assign a unique picture. When a potential marker is found in an image, the system uses the known, perspective-free shape of the tag (see the next section) to undo the perspective transform of the picture. Establishing identity is then a matching process: we compare the corrected picture against every marker known to be deployed and use a variety of metrics to find the best match and hence identify the marker. The downside of this approach is that the matching process is typically slow, since every deployed marker must be compared, and its performance is very hard to characterise analytically.
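
The two approaches above can be contrasted with a minimal sketch. It is not Cantag's implementation: the function names, the bit layout (payload bits followed by a single even-parity bit) and the use of Hamming distance as the matching metric are all assumptions made for illustration. A real symbolic scheme would normally use a stronger error-correcting code than one parity bit, which can only detect an odd number of bit errors.

```python
def decode_symbolic(bits):
    """Return the payload as an integer, or None if the parity check fails.

    Assumed (hypothetical) layout: payload bits followed by one even-parity bit.
    """
    *payload, parity = bits
    if sum(payload) % 2 != parity:
        return None  # corrupted read, or not a marker at all
    value = 0
    for bit in payload:
        value = (value << 1) | bit
    return value


def match_pattern(sample, templates):
    """Return (name, distance) of the template closest to the sample.

    `sample` and every template are equal-length sequences of 0/1 cells taken
    from the perspective-corrected picture. Distance is the Hamming distance.
    """
    best_name, best_dist = None, None
    for name, template in templates.items():
        dist = sum(a != b for a, b in zip(sample, template))
        if best_dist is None or dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist
```

Note the difference in cost: `decode_symbolic` reads the identity directly from the sampled bits, whereas `match_pattern` has to visit every deployed template before it can answer.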

The present distribution of Cantag supports symbolic coding on markers. However, it has been designed so that pattern matching (or other approaches to identity) is easy to add.

## Inferring position and pose

We infer the position and pose of a marker from its known shape. For example, a square marker often features a solid square border around the identity bit pattern. When processing an image, we search it for shapes that could possibly be a valid perspective projection of the known shape. For example, a square projects to a general quadrilateral.
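
As an illustration of testing whether a shape could be a valid projection, the sketch below applies one necessary condition for square markers: a square lying entirely in front of the camera projects to a convex quadrilateral. The function name `is_convex_quad` is hypothetical, and this filter is an assumption for illustration rather than the test Cantag itself applies. It expects the four corners in boundary order.

```python
def is_convex_quad(points):
    """True if four (x, y) points, taken in order, form a strictly convex quadrilateral."""
    if len(points) != 4:
        return False
    signs = []
    for i in range(4):
        ax, ay = points[i]
        bx, by = points[(i + 1) % 4]
        cx, cy = points[(i + 2) % 4]
        # z-component of the cross product of two consecutive edges:
        # its sign tells us whether the boundary turns left or right here.
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if cross == 0:
            return False  # three collinear corners: degenerate shape
        signs.append(cross > 0)
    # Convex means every corner turns the same way.
    return all(signs) or not any(signs)
```

Passing this check does not make a shape a marker. It only discards candidates, such as self-intersecting or dart-shaped outlines, that no view of a square could produce.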

Once we have identified a candidate marker, we undo the perspective projection using the known size and shape of the marker, expressed as points in the marker's own reference frame. For example, in the frame of a square marker the four corner points may be at (-1,1), (1,1), (1,-1), and (-1,-1), and we look for a transform that would cause them to be projected to the corresponding points identified in the image. Once we have this transform, we implicitly know the position and pose of the marker in the reference frame of the camera. The marker can then be identified (and possibly rejected as noise if it does not carry a valid identity).
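
For a square marker, one common way to find such a transform is to solve for the 3x3 planar homography that maps the four marker-frame corners onto the four image corners. The sketch below does this with plain Gaussian elimination. The function names are hypothetical, fixing the bottom-right matrix entry to 1 is an assumption that holds for ordinary views, and none of this is taken from Cantag's source. The sketch stops at the 2D transform and does not extract a 3D position or rotation from it.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def find_homography(src, dst):
    """Return the 3x3 matrix H (with H[2][2] = 1) mapping each src point to its dst point."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), and likewise for v,
        # rearranged into two linear equations in the eight unknowns.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def project(h, point):
    """Apply homography h to a 2D point, including the perspective divide."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

A usage example, with made-up image coordinates for the four detected corners:

```python
marker_corners = [(-1, 1), (1, 1), (1, -1), (-1, -1)]
image_corners = [(10, 10), (50, 12), (55, 48), (8, 40)]  # hypothetical detections
h = find_homography(marker_corners, image_corners)
centre_in_image = project(h, (0, 0))  # where the marker's centre appears
```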