Augmented Reality

Introduction

Large-scale Augmented Reality

We perceive the real world at a much higher level of detail than that at which we could define an artificial Virtual Environment. We would like to retain this level of detail, whilst augmenting it, where appropriate, with extra data obtained from a variety of sensor systems. Such context-sensitive visualisation of data could be useful in many tasks, ranging from a technician fixing a complex piece of equipment to a tourist locating objects in an art gallery.

Previous Augmented Reality (AR) systems inside buildings have been tethered or restricted to small volumes of space. We believe it is important to deploy throughout a large and populous area in order to examine the potential of mobile AR. Therefore we have chosen to allow the AR user to roam freely within an entire building. At AT&T Laboratories Cambridge we provide personnel with AR services using data from an ultrasonic tracking system, called the Bat system, which has been installed building-wide. The deployment process highlights practical issues such as cost and ease of installation. Furthermore, if large numbers of people are working in the augmented environment on a day-to-day basis, we are forced to consider the social integration aspects of the system.

Ubiquitous Computing

We also believe that a properly-designed AR system can be thought of as an instance of Ubiquitous Computing; that is to say, it provides enhanced facilities throughout the environment, whilst being relatively unobtrusive to users. This goal could be realised by displaying information using either personal or environment displays. In this project we take the personal display approach, displaying augmentation information on appliances that are associated with a particular user. Our system is careful to avoid what we refer to as visual pollution of the environment, since it does not require targets, cameras or fixed panel displays. We also wish to avoid burdening the user with excessively heavy or cumbersome equipment which they have to carry around.

We have approached the challenge of implementing a wide-area, in-building AR system in two different ways. The first uses a head-mounted display connected to a laptop, which combines sparse position measurements from the Bat system with more frequent rotational information from an inertial tracker to render annotations and virtual objects that relate to or coexist with the real world. The second uses a PDA to provide a convenient portal with which the user can quickly view the augmented world. These systems can be used to annotate the world in a more-or-less seamless way, allowing a richer interaction with both real and virtual objects.

Location system and World model

To create an accurate model of the environment, we require detailed knowledge of the 3D positions and orientations of objects in the environment. Similarly, to provide users with an AR experience, it is necessary to be able to ascertain the 3D position and viewing direction of the user with high accuracy and low latency. The Active Bat system is used for these purposes.

A software architecture for Sentient Computing has also been implemented at AT&T, using the Bats and other sensors to update a model of the real world.

The Bat system is installed throughout our three-floor, 100,000 cubic foot office, which has over 50 rooms. The system is in continual use by all 50 staff, and tracks over 200 Bats. The Bats have a battery lifetime of 12 months. The ultrasonic receivers are mounted recessed in the centre of the ceiling tiles, with cables in the roof, which makes the tracking infrastructure extremely unobtrusive.

The world model currently contains 1900 software objects corresponding to personnel, telephones, computers, walls, windows, etc. in the real world. In this project, we aim to utilise the detailed data set inherent in the sentient system to provide users with a rich AR experience.

Head-mounted Display

Our first approach to providing users with an AR experience via a personal display is to project augmentation information onto an optical see-through head-mounted display (HMD) unit.

Hardware

Our HMD system consists of a 750 MHz IBM Thinkpad T21 equipped with a Lucent WaveLAN card to provide networking. Tracking is performed by an InterSense InterTrax inertial tracker, and by three Bats which are mounted onto a hard hat along with a Sony Glasstron head-mounted display, running at a resolution of 800x600 pixels. The laptop, rechargeable batteries and power supplies are mounted in a backpack with a single power cable, enabling the system to be docked anywhere in the building. The HMD system can run for approximately 3-4 hours before it needs to be recharged.

Tracking

Sensor information is sent to the laptop over the WaveLAN wireless link. The Bats attached to the HMD are used to obtain a least squares estimate of the position and orientation of the user's head. Each set of Bat readings yields a raw measurement of head position and orientation. Noise will cause these measurements to differ from the true head position and orientation, so we apply a filter to damp this.

The HMD software object takes the current estimate of head orientation and uses a non-linear filter to make a new estimate, based on the latest raw measurement. When head motion is slow, the filter applies very small, heavily damped corrections to the orientation estimate. When head motion is faster the corrections are much larger, and when movement is very rapid the latest raw measurement is immediately adopted as the new estimate. The filter is implemented using spherical linear interpolation (SLERP).
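A minimal sketch of a filter of this kind is given below. It is an illustration rather than the project's code: orientations are assumed to be unit quaternions in (w, x, y, z) order, the angular gap between the current estimate and the raw measurement is used as a stand-in for the speed of head motion, and the thresholds SLOW_ANGLE, FAST_ANGLE and MIN_GAIN are hypothetical tuning values.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalise(q):
    n = math.sqrt(dot(q, q))
    return tuple(x / n for x in q)

def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions (w, x, y, z)."""
    d = dot(q0, q1)
    if d < 0.0:                      # q and -q are the same rotation: take the short arc
        q1, d = tuple(-x for x in q1), -d
    if d > 0.9995:                   # nearly parallel: normalised linear interpolation
        return normalise(tuple(a + t * (b - a) for a, b in zip(q0, q1)))
    theta = math.acos(d)
    s0 = math.sin((1.0 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return tuple(s0 * a + s1 * b for a, b in zip(q0, q1))

def angle_between(q0, q1):
    """Rotation angle in radians separating two unit quaternions."""
    return 2.0 * math.acos(min(1.0, abs(dot(q0, q1))))

# Hypothetical tuning constants, not values taken from the project.
SLOW_ANGLE = math.radians(2.0)     # at or below this gap, correct very gently
FAST_ANGLE = math.radians(20.0)    # at or above this gap, adopt the measurement
MIN_GAIN = 0.05

def filter_orientation(estimate, measurement):
    """Return a new orientation estimate from the latest raw measurement."""
    err = angle_between(estimate, measurement)
    if err >= FAST_ANGLE:
        gain = 1.0
    elif err <= SLOW_ANGLE:
        gain = MIN_GAIN
    else:
        frac = (err - SLOW_ANGLE) / (FAST_ANGLE - SLOW_ANGLE)
        gain = MIN_GAIN + (1.0 - MIN_GAIN) * frac
    return slerp(estimate, measurement, gain)
```

The gain is the SLERP interpolation parameter, so a gain near zero leaves the estimate almost unchanged while a gain of one replaces it with the measurement.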

Sensor fusion

In practice the Bat location system provides only 2-3 measurements of the head position and orientation per second. By itself this update rate is insufficient to give a sense of immersion. To work around this limitation we fuse the sensor data from the HMD software object with that from the inertial tracking unit, which can provide orientation information with a very high update rate (up to 100 updates per second).

The inertial tracker only provides orientation updates, but it is less crucial to provide frequent estimates of head position than orientation, as angular velocities result in much larger image velocities than those caused by translational velocities. The estimate of orientation provided by the inertial tracker is prone to drift, so we use the Bats to correct for this in the medium-to-long-term, and rely on the inertial tracker in the short periods of time between Bat readings.

Each time an estimate of the head orientation is made by the HMD software object the estimate is communicated using CORBA to a process running on the laptop. This estimate is compared with the most recent reading from the inertial tracker, and a filtered correction to the inertial tracker is then calculated.
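The drift-correction step can be illustrated with a simplified sketch. For brevity it handles a single yaw angle in radians, whereas a full version would operate on 3D orientations; the class name DriftCorrector and the gain value are hypothetical and do not come from the project.

```python
import math

def wrap(angle):
    """Wrap an angle in radians into the range [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi

class DriftCorrector:
    """Keeps a slowly varying offset that is added to the inertial yaw."""

    def __init__(self, gain=0.2):
        self.gain = gain      # hypothetical smoothing factor in (0, 1]
        self.offset = 0.0     # current filtered correction

    def on_bat_estimate(self, bat_yaw, inertial_yaw):
        """Called whenever a Bat-derived estimate arrives (a few times a second)."""
        target = wrap(bat_yaw - inertial_yaw)           # correction implied by this reading
        step = self.gain * wrap(target - self.offset)   # move only part of the way
        self.offset = wrap(self.offset + step)

    def corrected(self, inertial_yaw):
        """Called for every inertial reading (up to 100 times a second)."""
        return wrap(inertial_yaw + self.offset)
```

Between Bat readings the output follows the inertial tracker exactly, and each Bat reading nudges the offset so that accumulated drift is removed gradually rather than in a visible jump.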

Calibration

Having calculated the position and orientation of the head, it is necessary to transform these into the reference frame of the user's eyes. This involves a translation from the origin of the helmet reference frame to the user's eye, followed by a rotation.
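The transformation just described might be written as in the following sketch. It assumes that rotations are 3x3 matrices stored as nested tuples, that eye_offset is the eye position expressed in the helmet frame, and that eye_rot rotates the eye frame into the helmet frame; all of these names are illustrative.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(3))

def mat_mul(a, b):
    """Multiply two 3x3 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )

def eye_pose(helmet_pos, helmet_rot, eye_offset, eye_rot):
    """Return (position, rotation) of the eye in world coordinates.

    helmet_pos : helmet origin in world coordinates
    helmet_rot : rotation taking helmet-frame vectors into the world frame
    eye_offset : eye position in the helmet frame (found by calibration)
    eye_rot    : rotation taking eye-frame vectors into the helmet frame
    """
    offset_world = mat_vec(helmet_rot, eye_offset)              # translation step
    eye_pos = tuple(p + o for p, o in zip(helmet_pos, offset_world))
    return eye_pos, mat_mul(helmet_rot, eye_rot)                # rotation step
```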

A series of calibration screens ensures the user is wearing the HMD properly, and guides them through the procedure. The user clicks their Bat over a cross-hair in the centre of their field of view. The user is requested to keep their head level (no roll) so that both a view direction vector and up-vector can be determined.
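One way in which a view direction and up-vector could be derived from a single Bat click is sketched below. It assumes that the eye position is already known in world coordinates, that the world z axis is vertical, and that the head is level as requested; the function name view_basis is hypothetical.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalise(v):
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)

WORLD_UP = (0.0, 0.0, 1.0)   # assumption: z is the vertical axis

def view_basis(eye_pos, bat_pos):
    """Return (view, right, up) unit vectors for a level head.

    view points from the eye to the Bat held over the cross-hair.
    With no roll, 'right' is horizontal, so it is perpendicular to
    both the view direction and the vertical.
    """
    view = normalise(sub(bat_pos, eye_pos))
    right = normalise(cross(view, WORLD_UP))
    up = cross(right, view)
    return view, right, up
```

The construction breaks down if the user looks straight up or down, since the view direction is then parallel to the vertical and the cross product vanishes.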

The figure below shows a typical view through the HMD. In this case the system has labelled a user, a computer (hostname tamarillo) and a telephone (number 498). As the person (or other object whose position is monitored by the sentient system) moves, the label follows them in the user's view.

Person and object labels viewed through the HMD

Batportal

The Batportal is a lightweight alternative to the HMD, based on a hand-held PDA, which improves portability and ease of use. The display is, of course, non-immersive, and the tracking capabilities are less effective than those of the HMD system.

Hardware

The Batportal consists of a Compaq iPAQ running Linux, with a Lucent WaveLAN card and a Bat attached to the top of the device. The iPAQ has a 240x320 pixel colour touchscreen.

Principles of operation

In use, the device is held at arm's length, rather like a magnifying glass. The positions of the user's personal Bat and the Bat fixed to the handheld device are combined to form a direction vector in which the user is looking. Augmentation information can then be rendered based on the user's location and direction of view, using the information in the sentient system's model.
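The direction calculation amounts to normalising the vector between the two Bat positions, as in the sketch below. Positions are assumed to be (x, y, z) tuples in a common world frame, and the function names are illustrative.

```python
import math

def view_direction(personal_bat, device_bat):
    """Unit vector from the user's personal Bat towards the Batportal's Bat."""
    d = tuple(b - a for a, b in zip(personal_bat, device_bat))
    length = math.sqrt(sum(x * x for x in d))
    if length == 0.0:
        raise ValueError("Bat positions coincide; direction is undefined")
    return tuple(x / length for x in d)

def heading_degrees(direction):
    """Heading in the horizontal plane, measured anticlockwise from the +x axis."""
    return math.degrees(math.atan2(direction[1], direction[0]))
```

For example, a personal Bat at (0, 0, 1.2) and a device Bat at (0, 0.6, 1.2) give the direction (0, 1, 0).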

Software

The iPAQ is used as a thin client, with applications running on a back-end workstation and the iPAQ simply acting as an I/O device. The iPAQ's display is accessed remotely using X11 across the WaveLAN; we have also tested a VNC version which is less vulnerable to disruption caused by gaps in WaveLAN coverage. The display is accessed remotely not because the iPAQ lacks CPU power, but to make the endpoint stateless, which reduces the maintenance effort required.

Registration

The Batportal's screen differs from the HMD in two significant respects: firstly it is not transparent, and secondly the viewing frustum is very narrow, particularly at arm's length. The lack of transparency is not a problem because the device is small, so the user can easily see the real world context around the screen. To overcome the narrow viewing angle we use a false perspective, giving a "fish-eye lens" effect. Consequently, registration does not have to be precise, since objects are not seen as directly overlaid.

The viewing angle can be adjusted using the iPAQ's cursor keys. We have also implemented a mode in which the "magnification" can be continuously adjusted by holding the Batportal closer or further away from the eye.
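A false perspective with an adjustable viewing angle could be produced in several ways. The sketch below uses an equidistant fish-eye mapping, in which the distance of a point from the screen centre is proportional to its angle off the view axis. This particular mapping is an assumption chosen for illustration, as are the function name and the view-space axis convention; only the 240x320 screen size comes from the text.

```python
import math

def fisheye_project(direction, fov_degrees, width=240, height=320):
    """Map a view-space direction to pixel coordinates, or None if not visible.

    direction   : (x right, y up, z forward) in the viewer's frame
    fov_degrees : full viewing angle covered by the shorter screen dimension
    """
    x, y, z = direction
    theta = math.atan2(math.hypot(x, y), z)    # angle away from the view axis
    half_fov = math.radians(fov_degrees) / 2.0
    if theta > half_fov:
        return None                            # outside the chosen viewing angle
    phi = math.atan2(y, x)                     # direction around the view axis
    radius = (theta / half_fov) * (min(width, height) / 2.0)
    px = width / 2.0 + radius * math.cos(phi)
    py = height / 2.0 - radius * math.sin(phi)  # screen y grows downwards
    return px, py
```

Widening fov_degrees squeezes more of the surroundings onto the screen, so the same parameter could be driven by the cursor keys or by the measured distance between the device and the eye.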

Ownership

The Batportal is designed to be a tool which can be picked up and used immediately, with no configuration or inconvenient sign-on process. To signify temporary ownership of a Batportal, a user simply presses a button on the device. The device checks which person is closest to it (using information from the sentient system), and starts displaying the world from that person's point of view. The device could also adopt the new user's personal preferences at that point in time.
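The ownership check reduces to a nearest-neighbour query against the positions held in the world model. A minimal sketch follows, assuming the positions are available as a mapping from person name to (x, y, z); the function name and data layout are hypothetical.

```python
import math

def closest_person(device_pos, people):
    """Return the name of the person nearest to the device, or None if nobody is tracked.

    device_pos : (x, y, z) position of the Batportal's Bat
    people     : dict mapping a person's name to their (x, y, z) position
    """
    if not people:
        return None
    return min(people, key=lambda name: math.dist(device_pos, people[name]))
```

A practical version would probably also reject candidates beyond some maximum distance, so that a button press on an unattended device does not assign it to someone across the room.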

Audio

A general-purpose audio server runs on the Batportal. This is combined with the Festival text-to-speech engine, to provide speech output from the device. Festival runs continuously on the back-end machine, synthesizing utterances which are then sent to the Batportal via WaveLAN.

Speech is currently used to identify which room the user is in, and to provide feedback when the current user or mode changes. Status information that can be communicated aurally does not clutter the limited screen area, and makes the device friendlier in use.

Sound output could be a distraction in a busy office environment, so for audio-intensive applications we use a lightweight single-sided earphone. For example it should be possible for the Batportal to narrate the subject lines of e-mails or answer queries about the environment even when the user is walking down the corridor or sitting in a meeting.

Applications

Annotation

Our sentient computing system provides a model of the world which includes objects such as computers, furniture, phones and personnel. Not only do we know the physical locations of these objects, but we have access to other properties and state, such as which people are visitors and whether a phone is on or off the hook. We can use the AR system to augment the user's view of these objects with annotated labels. Colour coding of the labels is used to give cues to the state of the objects they label.

Annotation can be applied to fixed points in space (such as a room) or to moving tagged targets such as other people. This is a useful way of checking someone's name, office and perhaps common interests and so on. The sentient computing platform allows this information to be shared by all users of the AR system, whether they are using the HMD, Batportal or a traditional interface on a PC.

Navigation

An interesting class of AR application involves navigation within buildings. Examples include finding one's way around, locating another person or following a personal augmented tour of the building.

We can use our AR systems, together with our sentient computing environment, to display the locations of people, walls, computers, telephones and other objects relative to the user. The level of augmentation can be varied to support the particular task that the user wishes to achieve. For example, suppose the Batportal system renders a 3D view of the current state of the building. Walls can be switched between opaque and transparent, giving the device an "X-ray vision" capability. The user can also choose to display the structure of the entire building or just the current floor.

Navigation is possible using various means such as virtual signposts, a 2D map, compass arrows or turning signals. Virtual marker objects can be created by pressing a trigger button on the user's Bat (in the HMD system) or iPAQ (in the Batportal system), or by utilising a mode in which a virtual marker is automatically placed every half-second to create a trail, showing the route taken through the building by the user.
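The automatic trail mode can be sketched as a recorder that drops a marker whenever at least half a second has passed since the previous one. The class name TrailRecorder and its interface are hypothetical; timestamps are assumed to be in seconds.

```python
class TrailRecorder:
    """Places a virtual marker at the user's position at a fixed interval."""

    def __init__(self, interval=0.5):
        self.interval = interval    # seconds between markers
        self.markers = []           # recorded (x, y, z) positions, oldest first
        self._last_time = None

    def update(self, timestamp, position):
        """Feed in each new position fix; returns True if a marker was placed."""
        due = (self._last_time is None
               or timestamp - self._last_time >= self.interval)
        if due:
            self.markers.append(tuple(position))
            self._last_time = timestamp
        return due
```

Rendering the markers in order then shows the route taken through the building.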

Virtual buttons

The personal Bats worn by members of staff at AT&T Laboratories Cambridge have an easily detachable mount, which means they can be held and used as 3D pointing devices. We can then construct a 3D user interface that extends throughout the building, and which is analogous to a conventional 2D GUI driven by a mouse pointer. If a Bat is held up to a point in space that has some particular application-level significance, the command associated with that point in space can be invoked by the sentient computing system. An example might be a point near a scanner that, when "clicked" using a Bat, starts a scan and automatically forwards the resulting image to the user's mailbox.

Normally, these active points in space (known as virtual buttons) are physically labelled by a post-it note or poster. However, we can extend this interface within the personal space of the user of the HMD system by dispensing with the physical labels, and relying on the AR annotation of the physical point to indicate that a virtual button is present, and what that virtual button controls. This approach has the advantage of reducing the amount of visual clutter in the environment, and has proved to be practicable.
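The virtual button mechanism can be sketched as a registry of small spherical regions, each with a label for the AR annotation and a command to run. The class name, the spherical shape of the active region and the callback signature are all assumptions made for illustration.

```python
import math

class VirtualButtons:
    """Registry of active points in space and the commands bound to them."""

    def __init__(self):
        self._buttons = []   # entries are (centre, radius, label, action)

    def register(self, centre, radius, label, action):
        """Bind 'action' (a callable taking the user's name) to a region of space."""
        self._buttons.append((tuple(centre), radius, label, action))

    def labels(self):
        """Centres and labels, e.g. for drawing AR annotations in place of posters."""
        return [(centre, label) for centre, _, label, _ in self._buttons]

    def click(self, bat_pos, user):
        """Handle a Bat button press; returns the label invoked, or None."""
        hits = []
        for centre, radius, label, action in self._buttons:
            distance = math.dist(bat_pos, centre)
            if distance <= radius:
                hits.append((distance, label, action))
        if not hits:
            return None
        _, label, action = min(hits, key=lambda hit: hit[0])
        action(user)
        return label
```

In the scanner example, the action would start a scan and forward the image to the mailbox of the user passed in.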

Scenarios

Environments in which we envisage AR systems being particularly useful include museums, trade shows, libraries, department stores, supermarkets and hospitals.

For example, in a supermarket a simple 2D map on a Batportal could assist with locating items and indicating routes, as well as highlighting special offers and items which have been purchased before. The screen could also be used to display prices, ingredients and recipes.

Consider a hypothetical museum example. Museums are attractive environments for AR systems, because the infrastructure only has to be installed once, after which it is unnecessary to physically label objects when exhibitions change. Metadata can be added directly to the virtual world in a way which is complementary to, but easier than, creating physical signs or guidebooks.

The AR systems described here are personal devices, and so do not interfere with other visitors' experiences, and can be customised to take account of each user's age, language, interests and preferences. For example, one could request that the history of each painting be displayed on approach, the titles of modern art be withheld and any African sculpture nearby be highlighted. Furthermore a personal guided tour could be created with a different emphasis from the standard order of presentation.

The system could behave quite differently for children on a school visit than for ordinary visitors. Functions would include drawing attention to objects or aspects which the teacher considers important, or monitoring a "treasure hunt" for particular items (say three pictures which contain a mermaid, on discovery of which the students are rewarded with pop-up information to complete a worksheet). The teacher can readily check if the objectives have been completed, and co-operation is also possible, since routes to interesting places can be transmitted to peer AR systems.

Summary

We have developed an Augmented Reality system around the AT&T sentient environment using two different types of endpoint: a head-mounted display and a handheld PDA. Both endpoints can be used whilst performing other activities, and the PDA at least is sufficiently lightweight and discreet to come close to meeting social acceptance criteria. We have developed prototype user interfaces, using a mixture of 2D and 3D graphics, speech and minimal (or automatic) input methods.

People


David Ingram, Joseph Newman

More information

A more detailed description of the project can be found in the paper "Augmented Reality in a Wide Area Sentient Environment" by Joseph Newman, David Ingram and Andy Hopper.

This paper appeared in Proceedings of the 2nd IEEE and ACM International Symposium on Augmented Reality (ISAR 2001), October 2001, New York.