Distributed Virtual Reality

Distributed VR is a relatively new area of research. Virtual reality systems are now largely software components, rather than requiring the dedicated head-up display, input controller and renderer hardware of the past. High-end workstations can now render scenes described in VRML and other languages in near real time. Just as the introduction of audio and video input and output on desktop machines led to the deployment of software based multimedia conferencing, we expect to see the deployment of multi-user virtual environments over the Internet shortly.

In this paper, we present an architecture for distributed virtual reality. We outline the necessary network support and a transport protocol, and the way that distributed virtual reality applications would interact with each other using these mechanisms (the API, if you like). It is a goal of the architecture to provide policy free mechanisms for distributed VR application builders. It is not a goal to make it easy to program such applications: easy-to-program distributed system tools all too easily let the application builder overload the network while at the same time providing suboptimal performance for the user. The architecture provides only the hooks that are necessary and sufficient for distributed VR.

Multicast routing is a mature area of research in the Internet. The system that we now call the "Mbone"[#!mac:94!#] has its roots in the research by Cheriton and Deering, in which they developed the Internet multicast model, including the idea of host groups[#!igmp!#] and the basic service model for IP multicast[#!deer:88!#]. It is now widely deployed in the network, and available in most common host operating systems and routers.

Some functions in a distributed system can best be performed in intermediate nodes or routers, whilst others can best be performed in end systems or hosts. The end to end principle[#!end2end!#] is used to select where to place a function.
The principle is that a function that requires end system knowledge to carry out correctly should only exist in the end systems; where it appears in the network, it should only be as a performance enhancement.

Two principles behind the design of high performance, low cost protocols that have proved themselves in this environment are Application Layer Framing (ALF) and Integrated Layer Processing (ILP)[#!ddc:91!#]. These state simply that the unit of synchronisation and recovery should as far as possible be the same as the unit of communication (the packet); and that, where possible, modular functionality that is specified in layered form should not be implemented as such, and new designs for communications systems should factor this in so that the processing associated with layered modules can be integrated in one pass.

Combining these principles with the use of multicast for many-to-many communication, a number of further techniques have arisen for protocol design:

Multicasting everything is a good idea. In applications with relatively high packet rates, the use of multicast for control information as well as for user data is not a high load, and can greatly simplify convergence of protocols for correctness, as well as performance and synchronisation.

As error rates in modern networks have decreased, end to end recovery from packet loss or re-ordering has been seen to be a better design than hop-by-hop recovery. As we move from one-to-one, through one-to-many and on to many-to-many applications, we can see that the same principle has to be adapted. Neither the original sender nor the network can deal with the task of delivering packets in order, and senders cannot know when (or which) receivers are missing packets.
It is not a good idea to use a positive-acknowledgement plus timeout-retransmission scheme for multicast applications because of the well known "implosion" problem (congestive collapse caused by multiple acknowledgements returning to the sender or senders)[#!uclmtp!#]. Scalable reliable multicast (SRM)[#!van:95!#] is a technique that has seen wide deployment in LBL's whiteboard application (the so-called "wb"). Wb uses the principles above to provide reliable delivery. The protocol and repair algorithm are briefly as follows (paraphrased from[#!van:95!#]):

Messages are sent with a sequence number plus a timestamp. There are three basic types of messages: data messages, negative acknowledgements and heartbeat messages. All participants keep state for all other participants, which includes the following:

1. Source address, plus last seen in order sequence number.
2. The estimated distance (delay) from this participant to each of the others.

In addition, participants keep a number of the most recently received messages available.

On detecting a missing packet from a gap in the sequence number space (between the last packet received in order and the newly received packet), a receiver prepares to multicast a negative acknowledgement, which acts as a request for a repair from any other participant. However, the receiver defers sending the negative acknowledgement for a time. This time is set so that the participants that might send a negative acknowledgement (implicitly) conspire, and usually only one (or a small number) of them makes the request. To do this, the timer is drawn from a uniform distribution over the range [c1*dsa, (c1+c2)*dsa], where c1 and c2 are constants and dsa is the requesting participant's estimate of the delay to the source. This time is subject to a binary exponential backoff in the usual manner if there is no response. Participants that receive the request for repair, and wish to try to honour it, also dally before sending the repair.
Their hiatus is drawn from a uniform distribution over the range [d1*dab, (d1+d2)*dab], where d1 and d2 are constants and dab is the responder's estimate of the delay to the requester, to ensure that it is likely that only one (or at least only a few) send the repair. Repair request messages suppress repair requests from other sites missing the same packet. Repair responses suppress other responses (as well as hopefully satisfying the request!).

Finally, the delay estimation is based on the same algorithm used in NTP[#!ntp!#]. Heartbeat messages are sent by each participant carrying a list of other participants, together with, for each of them, the timestamp t1 from the last message seen from that participant, and the difference d = t3 - t2 between that message's arrival time t2 and the heartbeat send time t3. On receiving a heartbeat message at t4, the delay can be estimated as (t4 - t1 - d)/2. This can be calculated using a rolling average, and the mean difference kept for safety if required. (So long as paths are reasonably symmetric, and clock rates not too different, this suffices for the repair algorithm above.)

Some applications require packets to have a particular inter-arrival rate. So long as the network can support the average rate (i.e. there is no long term congestion), "Receiver makes good" is generally a low cost solution to dealing with jitter or partial packet re-ordering. It is hard to provide a global transmission clock in large heterogeneous networks. Essentially, if packets are timestamped, and the receiver clock does not drift or wander w.r.t. the sender clock too quickly, a receiver can run an adaptive playout buffer to restore the playout times of packets. The size of the playout buffer is essentially twice the inter-arrival variation, to ensure a significant percentage of packets arrive within the worst case time. A rolling average of the inter-arrival times can be used to track this (one is kept in any case if the algorithm described above is in use). If the mean delay varies (due to increased or decreased load on the network, or due to route changes) then quiet times can be used to make adjustments.
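The repair timers and the heartbeat delay estimate of the SRM algorithm described above come down to a few lines of arithmetic. The following Python sketch is illustrative only and is not taken from wb: the function and class names, the default values of c1, c2, d1 and d2, the doubling of the whole interval on each backoff round, and the smoothing gain of 1/8 are all assumptions made for the example.

```python
import random


def request_delay(d_sa, c1=2.0, c2=2.0, backoff=0, rng=random):
    """Wait before multicasting a negative acknowledgement.

    Drawn uniformly from [c1*d_sa, (c1+c2)*d_sa]. Each unanswered round
    (backoff = 1, 2, ...) doubles the interval; this form of the binary
    exponential backoff is an assumption.
    """
    return (2 ** backoff) * rng.uniform(c1 * d_sa, (c1 + c2) * d_sa)


def repair_delay(d_ab, d1=1.0, d2=1.0, rng=random):
    """Wait before multicasting a repair: uniform on [d1*d_ab, (d1+d2)*d_ab]."""
    return rng.uniform(d1 * d_ab, (d1 + d2) * d_ab)


def delay_sample(t1, d, t4):
    """One-way delay estimate from a heartbeat received at time t4.

    t1 is our own timestamp echoed back by the peer, and d = t3 - t2 is
    how long the peer held it before sending the heartbeat.
    """
    return (t4 - t1 - d) / 2.0


class DelayEstimate:
    """Rolling (exponentially weighted) average of delay samples."""

    def __init__(self, gain=0.125):
        self.gain = gain    # assumed smoothing gain, not from the source
        self.value = None   # no estimate until the first sample arrives

    def update(self, sample):
        if self.value is None:
            self.value = sample
        else:
            self.value += self.gain * (sample - self.value)
        return self.value


if __name__ == "__main__":
    est = DelayEstimate()
    est.update(delay_sample(t1=100.0, d=0.25, t4=100.75))  # sample of 0.25 s
    print("request wait:", request_delay(est.value))
    print("repair wait:", repair_delay(est.value))
```

Because each wait scales with the estimated delay, participants closer to the source (or to the requester) tend to time out first, and their multicast request or repair then suppresses the others.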
Cheriton[#!cher:95!#] describes a scheme called log based receiver repair, which was devised for distributed simulations in DIS (the DSINet, from which much of this research stems, is an ARPA funded program of work whose target is the development of a suite of systems to support the Synthetic Theatre of War demonstration in 1997). This is similar in spirit to the SRM approach, but has separate log servers rather than expecting all applications to participate in the repair algorithm. The upside of this is that a larger history may be kept (for applications where the entire history is necessary to reconstruct current state, it may be too costly to distribute it to all sites). The downside is that a distinguished server type needs a distinguished protocol to maintain replica servers and so forth.

Synchronisation of messages from different sources is generally a bad thing[#!lws!#]. Open loop protocols such as the types we have described above (and many other heartbeat style protocols, such as routing update and reachability protocols) are prone to synchronise send times. This can be avoided by careful selection of randomising timers based on unique participant data (the participant's own address is a good example).

Multicast applications cannot use positive feedback for reliability. Nor can they use positive explicit feedback for congestion or flow control. Instead, implicit and aggregated information may be more effective. One scheme for congestion control and multicast is described by Wakeman et al.[#!ian:94!#].

It is important to be flexible. Depending on whether group communication can proceed at the speed of the slowest participant (or the link to them), at the average, or be completely heterogeneous, we need different schemes for flow and congestion control. We can also separate these into sender and receiver based adaptation, and the next item refers to work in this area.
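Returning to the point about synchronised send times earlier in this list: one way to derive a randomising timer from unique participant data is to seed a per-participant random number generator from the participant's own address and jitter each heartbeat interval around its nominal value. This is a minimal sketch under assumptions; the function name, the CRC32 seeding and the 50% spread are illustrative choices and not part of any protocol described here.

```python
import random
import zlib


def make_jittered_timer(own_address, nominal, spread=0.5):
    """Return a function yielding jittered heartbeat intervals in seconds.

    The generator is seeded from the participant's own address (an
    assumed choice of unique participant data), so different participants
    follow different interval sequences and do not lock step.
    """
    seed = zlib.crc32(own_address.encode("utf-8"))
    rng = random.Random(seed)

    def next_interval():
        # Uniform on [nominal*(1-spread), nominal*(1+spread)].
        return nominal * rng.uniform(1.0 - spread, 1.0 + spread)

    return next_interval


if __name__ == "__main__":
    timer = make_jittered_timer("10.0.0.1", nominal=5.0)
    print([round(timer(), 2) for _ in range(5)])
```

The mean interval stays at the nominal value, so the jitter changes when heartbeats are sent without changing the average load they place on the network.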
In Wakeman's scheme, the sender elicits responses from a receiver set at a given distance by multicasting out a packet with a sliding key, which essentially acts as a selector to choose some small percentage of recipients to act as samples and report on traffic conditions seen at their point in the network w.r.t. the sender. In the LBL work, this idea is generalised: multicast receivers keep state about all senders. As with SRM, they periodically send heartbeat messages, which contain this state (perhaps spreading the state over a set of heartbeat messages to keep the size of the updates small). This state can be used by senders to estimate the conditions at all points in the network. To keep heartbeat/session traffic to a reasonable level, the rate of the beat is reduced in proportion to the number of participants. Although this means that the rate of samples from any given system is decreased for a larger group, the number of samples stays the same. In fact, as a group gets larger, it is statistically reasonable to assume that it is more evenly and widely distributed, and may give better and better samples concerning traffic conditions throughout the multicast distribution fabric.

McCanne, in his work on multicast video, has looked at layered media encodings, and how they may be mapped onto multiple groups[#!mccanne:96!#]. In this work, different levels of quality (or urgency, or interest) are sent to different group addresses. The number of levels, and the amount that is sent in each, can be adjusted using schemes such as the one just described. However, receivers can independently adjust the rate of traffic that arrives at them simply by joining (to increase) or leaving (to decrease) one or more groups, corresponding to the appropriate coding levels.

Flow and congestion signal actions are generally the same for multicast applications as for unicast.
The stable and safe algorithm used in TCP since 1988[#!van:88!#], with a slow start cycle, a congestion control cycle with exponential backoff and linear increase, and a fast retransmit cycle to avoid short lived congestion, can be employed.

The rest of this paper is structured as follows. In the next section, we outline the structure of a distributed virtual reality system. After that we look at some of the real requirements of distributed VR, from both the systems and the human perspective. Following that we present the transport protocol. Finally we look at further work that needs to be done.

General Idea and Problems
Virtual Reality Operations, User Views and Network Considerations
Application Model
The Distributed Virtual Reality Multicast Protocol, DVRMP

Jon CROWCROFT 1998-12-03