The Origami Project

Paper Interfaces to the World-Wide Web

Peter Robinson, Dan Sheppard & Richard Watts
University of Cambridge Computer Laboratory
Cambridge CB2 3QG England

Robert Harding & Steve Lay
University of Cambridge Department of Applied Mathematics and Theoretical Physics
Cambridge CB3 9EW England

Abstract: This paper reports on ways of using digitised video from television cameras in user interfaces for computer systems. The DigitalDesk is built around an ordinary physical desk and can be used as such, but it has extra capabilities. A video camera mounted above the desk, pointing down at the work surface, is used to detect where the user is pointing and to read documents that are placed on the desk. A computer-driven projector is also mounted above the desk, allowing the system to project electronic objects onto the work surface and onto real paper documents.

This paper describes a particular application in which the system is used to provide access to the World-Wide Web. A WWW page can be printed on paper and then placed on the digital desk and animated: when a link is selected with a pen, the corresponding link in the original HTML document is followed and the resulting page projected onto the desk.


Recent developments in electronic publishing have shown the value of hypertext both for documents on CD-ROM and also for on-line presentation through the World-Wide Web. Computers endow electronic documents with powerful new facilities, leading some to believe that electronic media will soon replace conventional media completely. The trouble is that people like paper. It's portable, tactile and easier to read than a screen; in fact, computers now generate far more paper than they replace.

At the same time, developments in computer hardware have greatly reduced the cost of attaching television cameras to computers. Cameras have moved from being an expensive peripheral for specialists to costing about as much as a monitor; further developments in technology will soon make the cost similar to that of a mouse. This raises the question of what new techniques will be appropriate when every computer routinely includes video input, possibly from several cameras.

Over the past few years, the University of Cambridge Computer Laboratory and the Rank Xerox Research Centre in Cambridge (formerly EuroPARC) have collaborated on research into the use of video in user interfaces [Robinson 1995, Stafford-Fraser 1996, Stafford-Fraser & Robinson 1996, Wellner 1993, Wellner 1994]. Computers 'watch' users at work and infer commands from gestures involving pens and paper. This is not virtual reality where the user is immersed in a totally synthetic, computer-generated environment, often donning a special headset and even clothes; this is augmented reality where the computers operate through everyday objects in the real world, enhancing them with computational properties.

Such a system requires the computer to monitor activities and to deliver its contribution as conventionally as possible, suggesting the use of video and, to a lesser extent, sound for input and output. Of course, this merely reflects normal practice. We are used to pointing to interesting parts of documents and commenting on them; electronic enhancements should operate in the same way.

At the same time, electronic, multi-media publishing has emerged as an alternative to conventional publishing on paper. The World-Wide Web and versions of reference books and fiction published on CD-ROM can enhance their conventional counterparts in a number of ways:

However, screen-based documents have a number of disadvantages:

We have been investigating ways of resolving these difficulties by publishing material as ordinary, printed documents that can be read in the normal way, enjoying the usual benefits of readability, accessibility and portability. However, when observed by a camera connected to a computer, they acquire the properties of electronic documents, blurring the distinction between the two modes of operation and giving a richer presentation than that afforded by either separate medium.

Our initial experiments have applied this technology to computer-assisted learning [New Technology for Interactive Computer Aided Learning]. Earlier work with Computer Illustrated Texts [Harding & Quinney 1990] supplemented printed books with software that was an integral part of the educational package but which had to be run separately. The two parts can now be united and a number of applications have been investigated. Separate papers discuss the presentation of mixed-media documents [Animated Paper Documents] and the internal architecture of our system [A Framework for Interacting with Paper].

This paper describes a particular application in which the system is used to provide access to the World-Wide Web [Berners-Lee et al. 1994]. A web page can be printed on paper and then placed on the digital desk and animated: when a link is selected with a pen, the corresponding link in the original HTML document is followed and the resulting page projected onto the desk. Sections of the documents can be captured in electronic form, edited, printed and animated in the same way.


The overall architecture of the animated paper document system is shown below [Fig. 1]. The system is written in Modula-3 [Nelson 1991], a high-level systems programming language whose object model has been extended to operate in a distributed environment [Birrell et al. 1993]. The principal components are as follows:

The Registry

At the core of the system is a Registry which maintains the association between electronic documents and their printed variants. It stores the image of each active document and the code of any interactions required for the document, together with cross references between these and further indexes to identify them. In the context of WWW documents, these correspond to links to other URLs, but the facilities allow much more general forms of interaction.
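As an illustration only, the associations held by the Registry might be sketched as follows. The class and field names (`Registry`, `DocumentRecord`, `Interactor`) are hypothetical and written in Python for brevity; the real system is written in Modula-3 and its internal data structures are not described in this paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical types for illustration; not the actual Modula-3 interfaces.

@dataclass
class Interactor:
    region: Tuple[float, float, float, float]  # (x, y, width, height) on the page
    action: str                                # name of the interaction code
    target: str                                # argument for the action, e.g. a URL


@dataclass
class DocumentRecord:
    doc_id: str        # unique identifier printed on the page
    page_image: bytes  # stored image of the document
    interactors: List[Interactor] = field(default_factory=list)


class Registry:
    """Associates printed-page identifiers with their electronic records."""

    def __init__(self) -> None:
        self._by_id: Dict[str, DocumentRecord] = {}
        # Cross-index from an interaction target to the documents that use it.
        self._by_target: Dict[str, List[str]] = {}

    def register(self, record: DocumentRecord) -> None:
        self._by_id[record.doc_id] = record
        for interactor in record.interactors:
            ids = self._by_target.setdefault(interactor.target, [])
            if record.doc_id not in ids:
                ids.append(record.doc_id)

    def lookup(self, doc_id: str) -> DocumentRecord:
        return self._by_id[doc_id]

    def documents_linking_to(self, target: str) -> List[str]:
        return list(self._by_target.get(target, []))
```

For a WWW document, each `Interactor` would carry the URL of a link; other actions would cover the more general forms of interaction.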

In the current implementation, the code implementing the interactions has to be linked into the system. However, this is just a temporary measure. A better long-term solution would be to store complete programs as Java applets [Arnold & Gosling 1996] or Obliq oblets [Brown & Najork 1996], which are more amenable to dynamic loading for remote execution. This would also simplify the handling of Java embedded in documents handled by the system.

Figure 1: Animated paper document framework.

The registry is accessed via a set of adaptors. These allow the database to be built and edited, documents to be imported from and exported to other forms of hypertext, and documents to be printed and animated on a DigitalDesk. The following are relevant for processing WWW documents.


Conventional hypertext can be absorbed into the animated paper document system; paper access to the World-Wide Web is possible through such an adaptor. Given a uniform resource locator (URL), the adaptor captures the information currently on the associated Web page in the registry. This includes the URLs of any links embedded within the page.

An HTML parser breaks the document into blocks of text (usually paragraphs, but at a finer grain where there are links) and images. These are then rendered as PostScript and the positions and content of the links recorded. All this information is kept in the registry. The page can then be printed simply from the PostScript, with further embellishment to assist subsequent page recognition.
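The splitting step can be sketched with Python's standard `html.parser` module. This is a minimal illustration under simplifying assumptions, not the parser used in the system: it handles only `<p>`, `<a href>` and `<img>`, and it omits the PostScript rendering and the recording of link positions. The names `BlockExtractor` and `extract_blocks` are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple

# Each block is (kind, content, href); kind is "text", "link" or "image".
Block = Tuple[str, str, Optional[str]]


class BlockExtractor(HTMLParser):
    """Splits HTML into text, link and image blocks (illustrative only)."""

    def __init__(self) -> None:
        super().__init__()
        self.blocks: List[Block] = []
        self._text: List[str] = []
        self._href: Optional[str] = None

    def flush(self) -> None:
        # Emit accumulated text as one block, tagged as a link inside <a href>.
        content = " ".join("".join(self._text).split())
        if content:
            kind = "link" if self._href is not None else "text"
            self.blocks.append((kind, content, self._href))
        self._text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href") is not None:
            self.flush()  # text before the link becomes its own block
            self._href = attributes["href"]
        elif tag == "img":
            self.flush()
            self.blocks.append(("image", attributes.get("src") or "", None))
        elif tag == "p":
            self.flush()

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.flush()  # the link text becomes a "link" block
            self._href = None
        elif tag == "p":
            self.flush()

    def handle_data(self, data):
        self._text.append(data)


def extract_blocks(html_source: str) -> List[Block]:
    parser = BlockExtractor()
    parser.feed(html_source)
    parser.close()
    parser.flush()  # emit any trailing text
    return parser.blocks
```

Splitting at each link gives the finer grain mentioned above: a paragraph containing one link yields a text block, a link block and a further text block.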


Documents in the registry can be edited with a fairly conventional WYSIWYG editor. Text and diagrams are entered and amended in the usual way. However, it is also possible to mark areas of the document as hyperlinks and to associate interactors with them. These are recorded as references to the associated code.

One version of the editor actually operates on the DigitalDesk, which means that text, diagrams and interactors from other printed documents can be copied into the new document. If the other printed documents are active documents known to the system, this copying is entirely digital, just as it would be in a conventional word processor. However, text and pictures can also be copied from conventional printed documents by using the overhead camera to capture an image and passing any text through an optical character recognition system.


Another adaptor prints out documents from the registry onto paper so that they can be used for direct interaction on the DigitalDesk. The printed documents are annotated with marks in their corners to facilitate recognition and location on the desk top, and also have a unique identifier printed in an OCR fount.

Once the document has been printed, its contents are retained in the registry as an immutable copy of its structure for future interaction. This allows the paper to continue working in the same way even if its electronic original is edited. However, any URLs referred to in the electronic version will have been remembered and will be followed when the paper version is animated. The contents of the pages identified by such URLs can change in the usual way.
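One way to picture this behaviour is a frozen snapshot taken at print time. The sketch below is an assumption about how such a copy could be modelled, not a description of the registry's implementation; `EditableDocument`, `PrintedSnapshot` and `snapshot_for_printing` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PrintedSnapshot:
    """Immutable record of a page as it was printed (hypothetical)."""
    print_id: str                # unique identifier printed on the paper
    blocks: Tuple[str, ...]      # content at the time of printing
    link_urls: Tuple[str, ...]   # URLs remembered for later animation


class EditableDocument:
    """The electronic original, which may change after printing."""

    def __init__(self, blocks: List[str], link_urls: List[str]) -> None:
        self.blocks = list(blocks)
        self.link_urls = list(link_urls)

    def snapshot_for_printing(self, print_id: str) -> PrintedSnapshot:
        # Copy into tuples so that later edits cannot affect the snapshot.
        return PrintedSnapshot(print_id, tuple(self.blocks), tuple(self.link_urls))
```

Only the structure and the URLs are frozen; the pages that those URLs identify are fetched when the paper is animated, so their contents can still change.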


The DigitalDesk actually animates the paper documents. This involves recognising that a page printed by the system has appeared on the desk, determining its position, reading its unique identifier and locating any interactors. A transformation is then set up between the page representation stored in the registry and physical co-ordinates on the desk top. The printed document thus becomes part of the projected window system. In particular, any active links are highlighted by projecting a red background over them. For a document originating on the Web, these correspond to links in the original HTML.
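The paper does not specify the form of this transformation. As a sketch, assume that the page lies flat, that three of its corner marks have been located in desk co-ordinates, and that an affine map (rotation, scaling, shear and translation) is adequate; a real overhead camera may well need a projective map instead. The names `Affine` and `page_to_desk` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Affine:
    """Maps (x, y) to (a*x + b*y + tx, c*x + d*y + ty)."""
    a: float
    b: float
    c: float
    d: float
    tx: float
    ty: float

    def apply(self, p: Point) -> Point:
        x, y = p
        return (self.a * x + self.b * y + self.tx,
                self.c * x + self.d * y + self.ty)

    def inverse(self) -> "Affine":
        det = self.a * self.d - self.b * self.c
        if det == 0:
            raise ValueError("corner marks are collinear; no inverse exists")
        ia, ib = self.d / det, -self.b / det
        ic, id_ = -self.c / det, self.a / det
        # The inverse translation is minus the inverse matrix applied to (tx, ty).
        return Affine(ia, ib, ic, id_,
                      -(ia * self.tx + ib * self.ty),
                      -(ic * self.tx + id_ * self.ty))


def page_to_desk(width: float, height: float,
                 top_left: Point, top_right: Point, bottom_left: Point) -> Affine:
    """Affine map sending page corners (0,0), (width,0), (0,height) to desk points."""
    return Affine(
        a=(top_right[0] - top_left[0]) / width,
        b=(bottom_left[0] - top_left[0]) / height,
        c=(top_right[1] - top_left[1]) / width,
        d=(bottom_left[1] - top_left[1]) / height,
        tx=top_left[0],
        ty=top_left[1],
    )
```

The forward map places projected highlights over the printed links; the inverse converts positions observed on the desk back into page co-ordinates.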

A pen with a light-emitting diode in its tip is used for pointing. This is recognised by the camera system and converted to co-ordinates using a transformation calculated by occasional registration. It would be possible to use a conventional graphics tablet, but the light pen has the advantage that it works perfectly well over a stack of paper on the desk. The events are passed back through the window system and interpreted using information in the registry. For a URL, this involves opening a new projected window on the desktop and displaying the contents of the associated page in it. The Modula-3 window system, Trestle [Manasse & Nelson 1991], and its user-interface toolkit, FormsVBT [Brown & Meehan 1993], include a window primitive that acts as a WWW browser, so this is straightforward.
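Interpreting a pen event then amounts to a hit test against the link regions recorded for the page. The sketch below assumes rectangular regions in page co-ordinates and a caller-supplied function from desk to page co-ordinates; `link_at` and its parameters are hypothetical, and opening the projected browser window is left out.

```python
from typing import Callable, List, Optional, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x, y, width, height) in page co-ordinates


def link_at(pen_on_desk: Point,
            desk_to_page: Callable[[Point], Point],
            links: List[Tuple[Rect, str]]) -> Optional[str]:
    """Return the URL of the link under the pen, or None if there is none."""
    x, y = desk_to_page(pen_on_desk)
    for (rx, ry, rw, rh), url in links:
        if rx <= x <= rx + rw and ry <= y <= ry + rh:
            return url
    return None


# Example: a page whose origin sits at desk position (100, 50), unrotated.
def example_desk_to_page(p: Point) -> Point:
    return (p[0] - 100.0, p[1] - 50.0)


example_links = [((10.0, 10.0, 80.0, 12.0), "http://www.cl.cam.ac.uk/")]
selected = link_at((120.0, 65.0), example_desk_to_page, example_links)
```

Here `selected` is the URL to display in a new projected window; a pen position outside every region yields `None` and the event is ignored.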


The interactions afforded by animated paper are considerably richer than straightforward HTML, but if a document is sufficiently simple, it can also be exported as HTML.

This involves scanning the image of the page from top to bottom, left to right and emitting text or images as appropriate. When a page is published in this way, a series of HTTP PUT commands is sent to the WWW server that will hold the page. One of these is for the HTML of the page itself; the others are for the richer features of DigitalDesk documents that cannot be translated into conventional HTML. These can be recovered either through another DigitalDesk or by a suitably extended WWW browser.
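A rough sketch of the export step follows, assuming that each block carries the position of its top-left corner with y increasing down the page. The block layout, the function names and the `.html` and extra-resource naming scheme are all hypothetical; the requests are only constructed here, not sent.

```python
import html
from typing import Dict, List, Optional, Tuple
from urllib.request import Request

# (x, y, kind, content, href); kind is "text", "link" or "image".
PlacedBlock = Tuple[float, float, str, str, Optional[str]]


def export_html(blocks: List[PlacedBlock]) -> str:
    """Emit blocks in reading order: top to bottom, then left to right."""
    parts = []
    for _x, _y, kind, content, href in sorted(blocks, key=lambda b: (b[1], b[0])):
        if kind == "link":
            parts.append('<a href="%s">%s</a>'
                         % (html.escape(href or "", quote=True), html.escape(content)))
        elif kind == "image":
            parts.append('<img src="%s">' % html.escape(content, quote=True))
        else:
            parts.append("<p>%s</p>" % html.escape(content))
    return "\n".join(parts)


def build_put_requests(base_url: str, name: str, blocks: List[PlacedBlock],
                       extras: Dict[str, bytes]) -> List[Request]:
    """One PUT for the HTML page, then one per richer, non-HTML feature."""
    page = export_html(blocks).encode("utf-8")
    requests = [Request("%s/%s.html" % (base_url, name), data=page,
                        headers={"Content-Type": "text/html"}, method="PUT")]
    for suffix, payload in extras.items():
        # Hypothetical naming scheme for the features HTML cannot express.
        requests.append(Request("%s/%s.%s" % (base_url, name, suffix),
                                data=payload, method="PUT"))
    return requests
```

Each `Request` could then be passed to `urllib.request.urlopen`, provided the server accepts PUT for those paths.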


The pictures below [Fig. 2] show the system in use as a World-Wide Web page passes through the stages just described:

(a) The Computer Laboratory's WWW home page is displayed by a conventional browser.
(b) This is imported into the animated paper document system's registry and printed on paper with extra decorations to assist recognition.
(c) When this is placed on the DigitalDesk it is recognised and active areas of the document are illuminated by projected highlights. One of the links has been followed and the contents of the associated URL are being projected onto the desk through a browser running in a separate window.
Figure 2: Paper access to the World-Wide Web. (a) Original Web page. (b) Printed version. (c) Animated on the DigitalDesk. (d) Deriving a new document. (e) Printed version. (f) Animating the new document.

(d) The editor is invoked and sections are copied from the paper document into a new electronic document projected onto the desk. This uses conventional copy-and-paste but works from a paper document to an electronic one. Existing links can be copied and new links added.
(e) The new document is printed with the standard decorations.
(f) The new paper document can also be animated on the DigitalDesk and used to activate WWW links.

This example shows how a conventional paper document can be used as the key providing access to the full range of electronic multi-media on the World-Wide Web.


In this paper we have described the use of animated paper documents to provide paper interfaces to the World-Wide Web. This combines the power of electronic hypertext and the convenience of printed documents.

Electronic publishing is a rapidly growing area with tens of thousands of titles in print on CD-ROM and hundreds of new titles being published each week. Direct publication exclusively in electronic form on the Internet is also growing. However, the problems of screen-based publishing - poor readability, limited view, slow access, inability to add personal annotations and so on - have limited its use to specialised applications. We believe that computer additions to printed texts offer a more promising approach, especially when delivered over a communications network.

Current work on animated paper documents is investigating both the underlying technology of the DigitalDesk and also new applications of mixed-media publication for educational material and more general use. We are particularly interested in using printed documents as the key to the delivery of electronic documents via network computers.


The original DigitalDesk was built by Pierre Wellner, a research student in the Computer Laboratory at the University of Cambridge sponsored by Rank Xerox. Current work on animated paper documents is sponsored by the EPSRC under grant GR/J65969.
