Description of Proposed Research

Abstract

A goal of the project is to understand the best forms of programming constructs and semantics for a programming system that enables unskilled users to configure, adapt, personalise and download new code into a home control system. The home control system will typically be the controller for a Home Area Network, but might initially be a set-top box or an intelligent audio-visual or home theatre system. The users themselves will not necessarily ever write or see the programming language. Instead, configuration and customisation will be carried out with spoken or keyboard input using natural language, optionally combined with gesturing (using an infrared wand), via keystrokes on normal IR handsets, by downloading from remote servers, or by remote maintenance from a central-office technician at a call centre. Another goal of the project is to produce a robust, working implementation that has been tried in real homes and which can be offered to industry. The results will also be applicable in scenarios outside the home, such as schools or offices. The primary result will be the definition of a standardisable Event Engine that stores, implements and executes the system.

Scientific/Technological Relevance

Scientific relevance

A scientific aim is to understand more fully the semantics of languages which combine logical, imperative and functional constructs together with event algebras. Many languages today combine imperative and functional programming in a natural way, but integration with logical constructs has normally simply caused imperative sequences to be triggered as a side effect of logical resolution or event occurrence. An exception, perhaps, is Escher [1]. A second scientific aim is to investigate the mapping from gestural and spoken or textual input in natural language to the underlying event control language. Current natural language processing (NLP) systems, such as the Alvey Natural Language Toolkit (ANLT), support generic mappings from NL to a higher-order logical representation, incorporating quantification, intensionality, and so forth. For this application, a predominantly first-order approximation to the full semantics should prove adequate for the underlying semantics of the event language (considerably simplifying the inference processing component). However, this mapping should also support monologues and simple (system-driven (dis)confirmation) dialogues involving elliptical, anaphoric and deictic (predominantly gesture-based) input, to facilitate ease of use by non-specialists. Thus the application provides a useful testbed for the portability and functionality of current generic NLP tools deployed in a multimodal context, and a speech and gesture based NL interface would potentially enhance the usability of the system substantially.

Technological relevance

Although the generation of people who are baffled by the exercise of programming their video recorder timer is receding, new technology is always arriving in the home (and workplace). New technology offers new potential that is not realised owing to the effort required to become familiar with it. A vision where every device is networked creates new potential from existing technology. Homogeneous user interfaces to devices help overcome the obstacles, and in this project the user interface is decoupled from the equipment, thus facilitating consistency. In an ideal implementation, a user may experience the same, personalised style of interface (or set of styles) for everything he encounters: at home, at work, at a friend's home or at a payphone. Another key aspect will be to discover the extent to which self-learning systems are attractive to users. Can they cope with systems which change on their own?

Beneficiaries, Collaborators, Dissemination and Exploitation

The intention behind the development of the Event Engine is to define and set the de facto standard for event-based control of systems. Contacts already exist with major consumer and telecom companies, such as Sony, Philips and BT, and input from these companies will shape the development. Prototype systems may be set up on their premises under licence from the University. Finally, it is envisaged that a consortium of interested parties will be established, with the University of Cambridge as a significant shareholder, to promulgate, license and support the developed technology. In 1993, Dr Greaves and Dr Hermann Hauser (a local businessman of some note) established the `HAN Consortium' to do this, but it was dissolved after the second meeting because the idea was too far advanced for most companies at that time. The scientific outputs of the project will be placed in the public domain using standard academic mechanisms, including conference presentations, journal papers, anonymous ftp and the World Wide Web.

Project Programme

Background

In the past, and elsewhere, several languages have been defined for the control of home devices; to date, each has suffered from various shortcomings. The important home control systems today are Echelon's LonWorks [2], Tandy's X.10 system, the ESPRIT European Home Systems project EHS [3] and CEBus from the CEBus Industry Council [4]. These have been designed for low data rates over mains carrier signalling. At higher data rates, ATM at 51 Mbps, IEEE 1394 (FireWire) at 200 Mbps and Ethernet at 10 and 1.3 Mbps are important contenders for future home networks. Although, in our own work, we are most interested in the ATM solution, the work proposed in the next section is at layers above the physical level and will encompass all of the above-mentioned technologies.

Many current home area networks do not have a user programming language. Instead, user customisation is achieved using a seemingly infinite number of miniature (DIP) switches on each device. However, it is cheaper and easier to produce devices without such switches, using computer technology for control and customisation. The most advanced programming system currently in commercial development is that of the CEBus Industry Council working groups, which is focussed on the development of the Common Application Language (CAL) [5]. Microsoft are also working in this area, but no information is to hand. CAL consists of a simple command language based around the transmission of single bytes or very short ASCII strings. Some of the commands represent constructs such as while and if, as found in imperative scripting languages, and so, like a much simplified implementation of Java, `programs' can be downloaded for remote execution. No work has been done on the automatic generation of these programs. Instead, a standard set of programs, forming the Home PnP specification [4], exists that is intended to be suitable for most applications.
Using these concepts, the CEBus Industry Council has set out a number of useful home context descriptions, including `audio amplifier', `tuner' and `TV', in its Home PnP Specification [4]. Put simply, CAL defines the language which enables CEBus devices to intercommunicate; Home PnP adds the necessary semantics for interoperability between them. In summary, although a diverse set of networks and network control systems has been, and continues to be, developed for the home, these have been fairly primitive and unimaginative. We believe there is scope for original research in home control, concentrating on customisation and ease of use, and this forms our proposal below.

Details of Proposed Research

The new project will become the major part of the Autohan project within the department [6]. Currently, Autohan is an umbrella for various projects to do with controlling devices in the home. Autohan is using a variety of home devices and home networks, including the Warren. Autohan is defining the functionality required in semi-intelligent home devices so that they can be plugged together to work without a controller or specific further programming and configuring. Autohan will also implement CAL on the Warren in order to aid industry acceptance. Mr Richard Bradbury (an EPSRC research student) is working full time in Autohan. The Event Engine constructed will be approximately along the lines suggested in our paper `Supporting Interactive Presentation for Distributed Multimedia Applications' [7] and in our working document `Event-driven Rules in Autohan' [8]. The document defines an event algebra and includes some example events and rules. A typical example is that the television volume is required to be muted if the telephone or the door bell rings. This can be expressed in the event language as follows:
rule tv-mute => TVOn(TVID) -> 
                (telephone-rings(PHONEID) | door-bell(DOORID))
                - TVOff(TVID);
{
    TVID.AudioOut.mute();
}
This expression starts monitoring when a television is switched on. It monitors independently for each television (owing to the => transition). This composite event accepts when a telephone or door-bell rings without the television being switched off first. This event will be specifiable in natural language, with or without gestures. For example:
  1. if a telephone or doorbell rings, mute all the TVs
  2. if the telephone or doorbell rings, mute the TV in the lounge
  3. if the telephone or doorbell rings, mute that TV (+ wand pointing gesture).
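The intended run-time behaviour of such a rule can be illustrated with a minimal sketch (in Python, purely for exposition; the class and event names below are hypothetical and do not form part of the proposed event language or its implementation):

```python
# Illustrative sketch of the composite-event semantics of the tv-mute rule.
# Event names and the TvMuteRule class are assumptions for exposition only.

class TvMuteRule:
    """Monitors 'TVOn(TVID) -> (telephone-rings | door-bell) - TVOff(TVID)'."""

    def __init__(self):
        self.armed = set()   # televisions currently switched on
        self.muted = []      # record of rule-body firings, for inspection

    def on_event(self, name, device_id):
        if name == "TVOn":
            self.armed.add(device_id)       # => starts a monitor per television
        elif name == "TVOff":
            self.armed.discard(device_id)   # '- TVOff' terminates that monitor
        elif name in ("telephone-rings", "door-bell"):
            for tv in self.armed:           # composite event accepts: fire body
                self.muted.append(tv)       # stands in for TVID.AudioOut.mute()

rule = TvMuteRule()
rule.on_event("TVOn", "tv-lounge")
rule.on_event("TVOff", "tv-lounge")
rule.on_event("telephone-rings", "phone-hall")   # no television on: no action
rule.on_event("TVOn", "tv-kitchen")
rule.on_event("door-bell", "front-door")         # tv-kitchen is muted
```

The set of armed televisions stands in for the per-television monitor instantiation implied by the => transition; a real Event Engine would maintain a separate monitor automaton per device.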
The Event Engine will interface at the lower level to APIs which have been developed in the Autohan project for input and output of events and device control. This software is intended to execute in embedded systems and to have no overt user interfaces; therefore it has no upper-level APIs. The software will be portable and will be compiled in a POSIX environment for testing and development. It may also be compiled to object formats, its intended form, and placed in ROMs or on web sites. The use of the Java VM within the software may well be appropriate. The second class of software allows new rules to be added to the system without the use of a screen or keyboard; sources of rules will include natural language input (spoken or typed), optionally combined with gestures, and downloads from remote servers. The RA requested will undertake the integration of the infrared wand and speech recognition system with the network control system. S/he will also assist with the porting and integration of the NLP toolkit. However, the research issues concerning the combination of deictic and natural language information from the gesture and speech/text channels of the multimodal interface, and concerning the mapping from each channel to the event control language, will be investigated by the researcher under the supervision of Briscoe, with further informal support from other members of the Cambridge NLP group.

Studentship

The RA will implement the physical parts of the gesturing and natural language input system. This will consist of handheld wands, basestations and software. The Warren project already has a number of IR (infra-red) basestations to support simple home control. These consist of nodes, one per room, connected to the home ATM network, which can send and receive IR pulses in the formats used by all current consumer IR remote controls. The system from the Warren project will be augmented as follows: the voice recognition system will be an off-the-shelf system, such as Dragon Dictate, which will be used to transcribe continuous speech into text; the information from the gesture channel will be integrated with the other channels in a loosely-coupled fashion; and both spoken and keyboarded NL input will be passed to the same (ANLT-derived) parsing system. In the mid-term this approach is inadequate, because careful correlation of potentially deictic use of pronouns with pointing events will be required to resolve reliably all the potential ambiguities which could occur in multimodal input, and because spoken and keyboarded NL may differ syntactically and lexically for this application. However, to keep the resource requirements minimal for this pilot project, while developing a prototype interface demonstrating the feasibility of the approach, we will concentrate at this stage on the fundamental issues of mapping to the event control language and coping robustly with the contextual dependencies required to support elliptical, anaphoric and deictic input.
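A loosely-coupled integration of this kind might resolve a deictic reference in the NL channel against the most recent pointing event in the gesture channel, as in the following sketch (Python; the function, the event representation and the two-second window are illustrative assumptions, not design decisions):

```python
# Sketch of loosely-coupled multimodal integration: a deictic pronoun in the
# NL channel ("that TV") is resolved against the most recent pointing event
# received on the gesture channel. All names here are illustrative only.

def resolve_deixis(utterance_tokens, gesture_events, window=2.0, now=10.0):
    """Return the device a deictic reference points at, or None."""
    if "that" not in utterance_tokens:
        return None          # no deictic reference to resolve
    # Consider only pointing gestures within the time window before 'now'.
    candidates = [g for g in gesture_events
                  if g["kind"] == "point" and now - g["time"] <= window]
    if not candidates:
        return None          # deictic reference but no recent gesture
    # Take the most recent qualifying gesture.
    return max(candidates, key=lambda g: g["time"])["target"]

gestures = [{"kind": "point", "time": 9.2, "target": "tv-lounge"},
            {"kind": "point", "time": 5.0, "target": "tv-kitchen"}]
tokens = "mute that TV".lower().split()
print(resolve_deixis(tokens, gestures))   # prints tv-lounge
```

A time-windowed nearest-gesture heuristic of this sort is exactly the kind of loose coupling that the mid-term remarks above identify as inadequate for fully reliable ambiguity resolution, but it suffices for a pilot prototype.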

Criteria for success

  1. Produce, test and field-trial a working prototype Event Engine.

  2. Produce and have published a white paper on the semantics of programming languages which treat event algebras and imperative sections at the same level.

  3. Determine the extent to which logical and event-based systems are more or less suitable for home control than existing imperative approaches.

  4. Determine the extent to which real users like our approach.

  5. Get our system accepted in the industry.

Foreground References

  1. `Declarative Programming in Escher'
    J. W. Lloyd.
    Tech Report No. CSTR-95-013.
    Department of Computer Science, University of Bristol, Jun 1995.

  2. `Neuron IC and Product Update'
    http://www.mot.com/SPS/MCTG/MDAD/lonworks/lon_update.html

  3. `The EHS European Home Systems Network'
    A. Kung, B. Jean-Bart, O. Marbach and S. Sauvage.
    Trialog, 25 rue de Général Foy, 75008 Paris.
    November 1995.
    http://www.trialog.com/ehs.html

  4. Home Plug and Play: CAL-based interoperability for Home Systems
    CEBus Industry Council, 4405 Massachusetts Avenue, Indianapolis, IN 46218,
    USA.
    1997.

  5. Common Application Language (CAL) Specification
    Standard EIA 600.81.
    Electronic Industries Association, 2500 Wilson Boulevard, Arlington, VA
    22201-3834, USA.
    1996.

  6. Autohan WWW project page
    http://www.cl.cam.ac.uk/Research/SRG/HAN/Autohan

  7. `Supporting Interactive Presentation for Distributed Multimedia Applications'
    John Bates and Jean Bacon.
    In Multimedia Tools and Applications, 1995, Vol 1, pp 47-48.

  8. Event-driven Rules in Autohan
    http://www.cl.cam.ac.uk/Research/SRG/HAN/Autohan/jobevent.ps