Chapter 2 Background A characteristic of continuous media which distinguishes them from the other types of data which are currently found in workstations is that they have genuine temporal requirements. So it is reasonable to expect that the techniques which have been developed for the construction of real­time systems will be applicable to systems in which applications handle continuous media. 2.1 Real­Time A real­time system is one in which the correct operation of the system depends not only on the logical correctness of any computed result, but also on the time at which the result is delivered. A system in which all computed results must be delivered on time is a hard real­time (HRT) system. A system in which the computed results are sometimes allowed to be late is called a soft real­time (SRT) system. A task is a single execution of a body of code. The arrival time of a task is the time at which the system first becomes aware that the task has to be executed. Tasks can be classified in terms of their arrival time characteristics. A periodic task arrives at regular intervals and is characterised by a fixed inter­arrival time or period T (t) and a computation time C(t). Aperiodic tasks are typically characterised by a stochastic arrival rate and may be bursty in nature. Because aperiodic tasks can arrive at any rate, there is the possibility that multiple instances of an aperiodic task could arrive within a short time and overload the system. For this reason, it is commonly assumed that there is a minimum time between the arrivals of different instances of the same task. Tasks with such arrival rates are known as sporadic tasks. There is still a chance that in a 6 system, multiple sporadic tasks will arrive within a short period of time and there will not be sufficient resources available to process them all within the required time. When this happens, a system is said to be experiencing transient overload. A task's release time is the time after which it is allowed to execute; its deadline is the time by which it must complete execution. These timing parameters can be represented on a timing diagram as shown in figure 2.1. This diagram depicts the execution of a task with arrival time A, release time R, computation time C and deadline D. The arrival and release times are indicated by the symbol '' and the deadline by #. A D C Time R Figure 2.1: Timing diagram for a task. 2.1.1 Real­Time Versus Fast With these definitions, it is possible to investigate the effect of resource utilisation on real­time problems. Figure 2.2 shows the timing diagram of a periodic process which runs every 100 milliseconds, requires 20 milliseconds of processing time and has a deadline of 20 milliseconds after its arrival. The speed of the processor has been chosen to be just sufficient to complete the task by its deadline. 0 20 40 60 80 100 120 140 160 180 200 220 240 Figure 2.2: A periodic task. Now suppose that another periodic process is introduced, which requires the same processing time, has the same period, but has a deadline of 40 milliseconds after its arrival. There is then only one order in which the processor can execute both processes so that they complete before their deadlines and this is shown in figure 2.3. The addition of a second task means that the system now has to be able to decide which of the tasks to run first. This is a result of there being some contention between the two tasks for a system resource (the processor) which can only be used by one task at a time. Note that the long term processor utilisation in this example 7 0 20 40 60 80 100 120 140 160 180 200 220 240 Figure 2.3: Two periodic tasks. is only 40%, but that during the interval of interest (the first 40 milliseconds of every period) the processor is fully utilised. Having to make this decision could be avoided by increasing the speed of the pro­ cessor by at least a factor of two. This would halve the amount of time required to execute each task with the result that, regardless of the order in which they would be executed, they would both be completed within 20 milliseconds of being requested. Doubling the processor speed would decrease the overall processor utilisation from 40% to 20%. It is precisely this decrease in processor utilisation which has removed the necessity of having to execute the tasks in a particular order. Additionally, if the results of these processes were to be delivered at a particular time, for example at their deadlines, then even if the processor is so fast that the order in which the tasks are executed is irrelevant, the system still needs to be able to arrange for the responses to occur at a specified time. Increasing processor speed has solved only part of the problem; extra work is required to ensure that the responses, having been calculated, occur at the appropriate times. This is a simple example, but it does present some consequences. Firstly, if tempo­ rally sensitive data is to be handled within a workstation, then attention to schedul­ ing will increase the amount of useful work which is achievable with a given piece of hardware. Secondly, increasing processor speed alone will neither remove nor solve the scheduling problems presented by CM systems; timely presentation of CM data can mean that, between its arrival and release times, a process can do little or no useful work even if the resources it requires are made available to it. Thirdly, the definition of a deadline within a HRT environment implies that the correct result must be delivered by a task before its deadline and that any result delivered after its deadline is of no use. When presenting continuous media, it is sometimes the case that approximate results delivered on time are useful and that correct results have some use if they are delivered only a short time after their deadlines. This suggests that continuous media systems may present some problems which have not yet been completely addressed by the current work on real­time systems. 8 2.1.2 Scheduling Correct implementation of a real­time system involves allocating the available re­ sources to the tasks which require them in such a manner that every task completes before its deadline. This is the purpose of the system scheduler which, whenever invoked, executes a scheduling algorithm to determine when and how to allocate available resources to the known tasks. Such an allocation is known as a schedule. If the allocation is such that all tasks complete before their deadlines, the resulting schedule is called a feasible schedule. In a static real time system, all of the tasks in the system, their arrival times and resource requirements are known a priori, so the scheduling decisions can be made off­line, reducing the run­time scheduling requirements to a triviality. In a dynamic real time system, no prior information is known about arriving tasks, so the scheduler has to make all of its decisions online. To do this, the scheduler is invoked at startup with the initial task set and at various times thereafter. In general, finding optimal schedules for all but the simplest of scenarios is difficult [Mok83]. 2.1.3 Periodic Tasks Periodic tasks are of interest in scheduling because they can be modelled easily and also because many of the activities which occur in control systems are periodic. [Liu73] presents the Rate Monotonic (RM) algorithm which provides a means of scheduling off­line a set of tasks given the following assumptions: 1 (A1) The requests for all tasks for which hard deadlines exist are periodic, with constant interval between requests; (A2) Deadlines consist of run­ability constraints only --- i.e., each task must be completed before the next request for it occurs; (A3) The tasks are independent in that requests for a certain task do not depend on the initiation or the completion of requests for other tasks; (A4) Run­time for each task is constant for that task and does not vary with time. Run­time here refers to the time which is taken by a processor to execute the task without interruption; (A5) Any nonperiodic tasks in the system are special; they are initialization or failure­recovery routines; they displace periodic tasks while they themselves are being run and do not themselves have hard, critical deadlines. 1 These assumptions are quoted directly from [Liu73]. 9 Using RM, a schedule is effected by assigning static priorities to tasks such that tasks with higher arrival rates have higher priorities. The tasks are executed preemptively in a system with a dispatcher which always runs the highest priority, runnable task. The utilisation U of the processor by m tasks can be calculated as U = P m i=1 (C i =T i ). It is shown that if U ! ln 2 then the schedule is feasible. In practice, the constraint U ! ln 2 is conservative, and there are many task sets which can be scheduled using RM whose utilisation exceeds ln 2. [Lehoczky89] extends these results and provides a necessary and sufficient condition for determining whether or not a set of periodic tasks is schedulable using the RM assignment of fixed priorities. [Liu73] also presents a deadline driven algorithm in which the task whose deadline is nearest is assigned the highest priority. It is shown that this algorithm produces a feasible schedule for a given set of m tasks if and only if P m i=1 (C i =T i ) Ÿ 1. Recent developments in the theory of scheduling hard real­time systems have enabled some of the assumptions made in the development of the RM algorithm to be relaxed. [Audsley91] provides a schedulability test for tasks whose deadlines are less than their period, thus removing part of the constraint (A2). Giving a task a tight deadline can be used to control the maximum amount of jitter experienced by the task. Allowing tasks to synchronise via semaphores dispenses with (A3). This can lead to priority inversion where a higher priority task is blocked on a semaphore which is held by a lower priority task --- a situation which violates the assertion that at any time the highest priority, runnable task is running. [Sha90] and [Nakamura93] describe protocols which can be used to limit the amount of priority inversion ex­ perienced by tasks which use semaphores for synchronisation. Assumption (A4) is still required in many techniques for scheduling hard real­time systems. For tasks whose run­time varies, the maximum possible run­time required by the tasks is determined and used in any scheduling calculations. Assumptions (A1) and (A5) constrain the analysis to periodic processes; dispensing with them enables the introduction of sporadic tasks into the system. 2.1.4 Sporadic Tasks [Lehoczky87] presents the Deferrable Server (DS) algorithm for processing sporadic tasks in a system of periodic tasks which has been scheduled using RM. DS creates a server task ø s with period T s and computation time C s , which is allocated a static priority P s according to RM. At the beginning of each period, ø s is allocated C s processor time with which to process any sporadic tasks which may arrive during 10 that period. Any time which is remaining at the end of the period is lost. DS provides a predictable response to sporadic tasks while maintaining the deadlines of the periodic tasks. [Dertouzos74] shows that the Earliest Deadline First (EDF) algorithm is optimal in the sense that if any other algorithm can produce a feasible schedule for a set of tasks whose request times, computation times and deadlines are known, then so can EDF. In contrast to static priority algorithms, EDF is suitable for use in dynamic systems because it requires only a knowledge of the task deadlines to produce optimal schedules. 2.1.5 Overload Behaviour It is useful to compare the behaviour of some of these scheduling algorithms when the system is overloaded. Using RM, the task with the longest period will be the first to miss its deadline as the system becomes overloaded. This may not be what is wanted and is a result of the priorities used by RM failing to take into account the importance of tasks. [Sha86] presents period transformation as a solution to this problem. Using period transformation, the priority which a task is assigned by RM is controlled by altering its period and computation time. A task with run­time C and period T which is assigned priority P by RM is transformed into an equivalent task with shorter run­time C=k and period T=k for some k ? 0 to which RM will assign a priority P \Lambda ? P . Priorities can also be decreased by increasing C and T . Using EDF, the manner in which the system degrades is not easily predictable. [Miller90] presents a predictive deadline scheduler which uses EDF to schedule tasks, and additionally assigns to each task an explicit measure of the importance of the task to the system. Tasks are required to maintain an estimate of their execution times and the scheduler uses these times in conjunction with deadlines and impor­ tance to create feasible schedules during transient overload situations. Both of these methods are retroactive; they allow the system to overload first then shed load according to some specification. Period transformation is a (somewhat artifical) means of conveying this specification to the scheduling algorithm. On the other hand, the deferrable server DS is proactive and avoids transient overload by explicitly controlling the amount of processor time which a sporadic task is allowed to consume. Both RM and the predictive deadline scheduler need to know the computation time of tasks and if these times are bursty, RM will produce schedules with a low processor utilisation. Both cause less important tasks not to receive any processor time at all 11 during overload. The construction of some CM applications can be simplified if they are given some minimal amount of processor time during transient overloads; at the very least, they can detect the passing of time and maintain synchronisation. In principle, DS affords such behaviour during overload; regardless of the arrival rate of the sporadic task, the computation time allocated to DS will always afford a certain minimum amount of service. In practice though, DS is scheduled using RM and is as susceptible to processor starvation during overload as other periodic tasks. 2.2 Continuous Media and Real­Time The preceding discussion has presented a number of techniques which have been developed in the context of HRT systems for handling temporally sensitive applica­ tions. It is necessary to consider the prospect of incorporating into a workstation operating system, sufficient HRT capability to be able to support CM applications in addition to the workstation's usual job mix. This could be done by a simple parti­ tioning of the processor resources using priorities. Assigning to HRT tasks priorities which are higher than best effort tasks would enable the HRT tasks to obtain their execution time guarantees, while enabling the best effort tasks to make use of any of the remaining cycles. Before running a HRT application, a schedulability check such as that provided with the RM priority assignment could be used to decide whether running the new application would cause the current schedule to become infeasible. 2.2.1 Hard Real­Time and Best Effort Job Mixes It has been shown [Tindell93] that, for a job mix consisting of 50% hard real­time periodic processes whose deadline equals their period and 40% sporadic soft real­time load, the response times of the sporadic load would be the same as those observed if the processor were running only the 40% sporadic soft load. This result could be applied to the design of a multi­media workstation to produce a system in which, provided that the multi­media activities consume 50% or less of the processor cycles, they will not be noticed by someone working on the same machine in a sporadic manner, such as when using an editor. An example of a HRT, CM application environment is described in [Jeffay92], where the YARTOS real­time kernel is being used to support live digital audio and video within a conferencing environment. The requirements placed upon this system are that it provide a service as good as that which can be provided by a conventional analogue system. This means that video frames have to be displayed every 33 milliseconds, buffering has to be kept to a minimum, zero or one buffer being allowed 12 within a connection, and great importance is placed on not dropping any frames. These requirements are extreme and are likely only to be encountered when it is desired to produce a high quality display of live video. To satisfy them, YARTOS draws heavily upon the techniques used in the design of hard real­time systems to provide: - support for shared resources to ensure that priority inversion does not occur; - an algorithm for determining whether a given workload can be guaranteed correct execution and; - an algorithm for sequencing tasks on a processor which guarantees that tasks will meet their deadlines. This system is capable of handling demanding application requirements but obtains this capability at the cost of a low utilisation of system resources; there are feasible sets of resource requirements which the schedulability checking algorithm will not guarantee. Such approaches may be acceptable in an isolated machine, where high quality CM presentations are required, but they have a number of drawbacks in a dynamic environment such as a networked workstation. Firstly, they afford only a low level of resource utilisation for the system's real­time task set. Secondly, the techniques used in HRT scheduling algorithms assume that the worst case resource requirements of each HRT process are known. In the case of processing compressed data from a live video source, it is difficult to estimate accurately such worst case requirements. Thirdly, if the applications which are being scheduled make use of a multi­service network to transport CM, and the network provides temporal and logical guarantees which are weaker than those which are required by a distributed HRT application, then the end point applications will be unable to meet their deadlines due to the data not being delivered on time. Finally, as the HRT load increases, the SRT tasks become starved of resources. In such a system, the only way to ensure some degree of timely behaviour from a process is to include it in the HRT task set. To gain an insight into a more flexible approach to resource allocation, the following section examines some of the methods proposed for the allocation of bandwidth in the multi­service networks which are being designed to transport CM data. 13 2.3 Networks A considerable amount of research is currently being undertaken to develop multi­ service networks which are suited to the timely transport of high bandwidth data such as CM. This section examines some of the methods used for allocating network resources to provide a suitable means of transport for CM data. 2.3.1 Transfer Mode In networking terminology, a channel's transfer mode refers to the manner in which messages are multiplexed onto the channel. Using Synchronous Transfer Mode (STM), a fixed proportion of the channel bandwidth is allocated to each of the messages waiting to be sent. This is done when the network is configured by divid­ ing each period of time on the channel into a number of slots, allocating a slot for the transmission of messages from each source, and sending a slot's worth of data from each source once every period. Using Packet Transfer Mode (PTM), the whole of the channel bandwidth is allocated to a message for the duration of its transmis­ sion. Asynchronous Transfer Mode (ATM) is a compromise between these two in which channel time is divided into a number of small, fixed­length slots or cells. 2 Any number of cells may be allocated at the next cell boundary for the transmission of a message. Each of these transfer modes corresponds to a particular scheduling strategy. STM effects a preemptive, round­robin strategy, transmitting slices of multiple messages so that they appear to be being sent at the same rate. PTM corresponds to a non­ preemptive strategy --- once the transmission of a message has been started, the channel is allocated until the entire message has been sent. The regular, frequent cell boundaries characteristic of the ATM provide points throughout the transmission of a message at which the transmission may be preempted, so ATM corresponds to a preemptive scheduling strategy, which allows bandwidth allocation decisions to be made at every cell boundary. In their raw forms, continuous media typically exhibit a Constant Bit Rate (CBR); Audio can be represented as a periodic stream of 8­bit or more samples and video as a periodic stream of frames of fixed size. They also consume a large amount of band­ width so use is often made of the fact that they contain redundant information to transform them into compressed representations. Silence suppression within an au­ dio stream is a simple compression mechanism. Video compression techniques such 2 A 5­byte header and a 48­byte payload have been chosen for B­ISDN, giving a 53­octet cell size. 14 as H.261 [CCITT90] and MPEG [ISO/IEC91] make use of more complex mechanisms such as chrominance sub­sampling, quantisation and selective discarding of spatial frequencies and motion compensation to produce reduced bandwidth representa­ tions of the original stream. These techniques work because of the informational property of CM. Two important consequences of such encodings are that the data streams which they produce may not necessarily be periodic and often will have a Variable Bit Rate (VBR). In the case of silence suppressed audio, the data samples may only be periodic during the talk intervals. In the case of motion compensated video, data may not be produced when there is no difference between a frame and a number of its successors; the picture content of differing frames may also require different numbers of bits to encode. While STM networks are well suited to carrying CBR traffic and are able to provide delay and jitter guarantees when doing so, they are less well suited to carrying VBR traffic due to their inability to allocate bandwidth dynamically. While PTM networks are able to allocate bandwidth dynamically, their inability to do so at short notice and preempt long packets means that they cannot provide the delay and jitter guarantees which STM networks can. The ability to dynamically allocate bandwidth at every cell boundary makes ATM networks suitable for carrying VBR traffic but in order to maintain delay and jitter guarantees, the network bandwidth has to be allocated appropriately. Figure 2.4 shows the effect of transmitting two temporally constrained messages m 1 and m 2 using each of the three transfer modes. m 1 arrives at t = 0, requires 10 units of time to transmit and has a deadline of 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 STM 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 PTM ATM m 1 m 2 m 1 m 2 m 1 m 2 Figure 2.4: Transmission of two temporally constrained messages. 15 t = 17. m 2 arrives at t = 2, requires 5 units of time to transmit and has a deadline of t = 10. 2.3.2 Bandwidth Allocation As has been mentioned, the success of ATM networks will depend on how well network resources can be allocated to the traffic sources which use it. The following sections review some of the concepts that have been developed to facilitate this process. [Bae91] provides a more detailed coverage of the issues. 2.3.2.1 Quality of Service From the point of view of an application, it is desirable for the network to behave predictably. For the network to do this, it needs to have some knowledge of the behaviour of each of the traffic sources it is carrying; the more detailed the knowledge about each of the sources, the better the network is enabled to allocate resources to them in a predictable manner. This situation is commonly abstracted by viewing the network as a provider of services and applications as users of these services. The level of service required by an application and provided by the network is described by a QOS specification. Essentially, a QOS specification has two purposes; firstly, it is used by applications to specify the behaviour which they require from any traffic they send over the network and secondly, it is used by the network to allocate the resources it needs to allocate in order to effect that desired traffic behaviour. Typical QOS parameters include delay, jitter, bandwidth and error rate. 2.3.2.2 Admission Control Having been presented with a QOS requirement by an application requesting a connection, the network has to decide whether or not it has available sufficient resources to provide the desired service quality. If it does, then the application traffic can be accepted; if not, the application should be informed. This process of deciding whether or not to accept a traffic source is called admission control. 2.3.2.3 Policing Once the network and application have agreed upon a particular QOS for some traffic, a two­way agreement is in force. The network, by accepting the traffic source and agreeing to provide the desired QOS to it, should try very hard to maintain 16 that level of service to the application. Similarly, the application, having requested a certain amount of bandwidth in its QOS specification, should not be able to consume any more bandwidth unless it is available and doing so will not interfere with other users of the network. 3 A policing function is required in the network to prevent this sort of interference. 2.3.3 Bursty Traffic Much of the data handled by ATM networks will be bursty; figure 2.5 shows a plot of bits of data per frame of a video sequence which has been coded using an MPEG encoder. 0 5 10 15 20 25 30 35 40 0 50 100 150 200 250 Frame Size (kilobits) Frame Number Min: Max: Avg: 3.95 kilobits 35.19 14.34 kilobits kilobits Figure 2.5: Bits per frame for video encoded with MPEG codec. The encoder has produced three types of frames: I frames contain all the information required to reconstruct a single frame of the original video; P frames contain the information required to predict one frame of the original video from the previous I frame and; B frames contain the information required to predict the original frame 3 This depends considerably on the viewpoint of the network designers. It may be the intention that only the requested amount of bandwidth ever be available regardless of the current utilisation of the network, or it may be that applications may try to make use of any spare bandwidth which fortuitously becomes available. 17 using the previous I or P frame and the next I or P frame. The video has been encoded using the sequence [IBBPBB] + . 4;5 In its original form, each of the frames of this video sequence is approximately 164 kilobits in size. As shown by the plot, the bandwidth of the compressed data varies from 3.9 kilobits per frame to 35.2 kilobits per frame. This burstiness raises the question of how much bandwidth ought to be allocated within a network to carry such traffic. 2.3.4 Traffic Descriptors A simple strategy for admission control and bandwidth allocation would be to allo­ cate the peak bandwidth required by a traffic source. This would have the benefits of always leaving sufficient bandwidth available for carrying the traffic and would require only one number to specify the traffic's bandwidth requirements. One dis­ advantage of this as an allocation strategy is that allocation of peak bandwidth requirements for a large number of bursty sources leads to unacceptably low utilisa­ tion of the overall network bandwidth. Another is that there are applications where the peak bandwidth requirement of such a stream is unknown; if the source of video were a camera recording a live scene, for example, then the bandwidth would depend on the nature of the scene and in such a circumstance, the stream's peak bandwidth would have to be guessed. 2.3.5 Statistical Multiplexing Network designers are investigating the use of statistical multiplexing techniques which allow less than peak bandwidth to be allocated to traffic sources and rely on carrying a large number of streams so that the short­term surplus bandwidth requirements of some of the streams can be met by allowing them to use bandwidth which is not currently being used by the remainder of the streams. Bandwidth in such networks can be allocated in a number of ways, each of which will provide different forms of QOS contracts to an application. If the peak bandwidth B p of a stream is known for the lifetime of the stream, then B a – B p could be allocated and the cell loss rate for the stream would be close to 4 The syntax [pattern] + is used to represent one or more occurrences of pattern. 5 Note that, because the second frame is encoded as a B frame and this requires the previous I frame (frame number 1) as well as the next P frame (frame number 4) to decode, the transmission and decoding order is I[PBBIBB] + . This accounts for the two adjacent peaks at frames 1 and 2 in the figure. 18 that of the hardware bit error rate. 6 Alternatively, some bandwidth B b ! B p could be allocated, resulting in a much higher cell loss rate. More often, the problem is stated in terms of knowing a tolerable cell loss rate, which is dictated by an application, and having to determine the bandwidth to be allocated, based on some knowledge of the traffic source's bandwidth requirements. This may be exact, but typically it will be in the form of some function of probabilities. 2.3.6 Service Degradation The very fact that less than peak bandwidth has been allocated to a traffic source means that it is possible for such a source to produce more data than can be handled by the connection that it has established. When this happens, the network is forced to offer the user a degraded service quality. The manner in which service quality degrades is specific to the service user, and for this reason should be part of the original QOS requirement parameters. For example, a video stream being viewed requires timely delivery of data, but is able to tolerate some loss of the data for small amounts of time. So an acceptable method of degrading service quality during bursts in the video traffic is simply to discard any excess data. 7 File system data, on the other hand, is sensitive to data loss, but often does not have very strict timeliness requirements, so an acceptable method of degrading service quality for this type of traffic would be to wait until sufficient bandwidth becomes available before sending any more data. 2.4 Real­Time and QOS The HRT and QOS paradigms might appear to provide two different approaches to allocating resources in a timely manner --- HRT provides stringent, determinis­ tic guarantees that deadlines will always be met, and much of the current research focusses on the use of QOS to provide a probabilistic assurance that resource re­ quirements will be satisfied a certain fraction of the time. It is worth remembering though, that an application's QOS requirement is a means of specifying the QOS which the application requires from the system, and can encompass requests for service qualities with such stringent guarantees as are provided by HRT. Given that 6 The loss of 1 cell in 10 9 is a commonly cited figure. 7 This has an effect on the coding method used. Coding techniques which strive for high com­ pression ratios tend to be intolerant of errors within the coded data stream. In the example of figure 2.5, the effects of discarding data will propagate only as far as the next I frame. 19 in any real system, there will be a finite probability of failure due to possible soft­ ware defects and a certain mean time between failures for any piece of hardware, one might also consider the difference between the ``guarantees'' offered by HRT and those offered by QOS when asked to provide an error rate close to that of the underlying hardware. A comparison might also be made between SRT and QOS. From the definition pro­ vided in section 2.1, a SRT system is one in which deadlines are sometimes allowed to be missed. Perhaps it is because of the subjective nature this definition, or the perception that SRT systems are generally easier to build than HRT systems and not as useful, that much of the research in real­time systems has been directed at the HRT case. Many of the systems which might employ QOS for resource allocation are SRT. The desire to maintain some form of assertion (albeit one involving proba­ bilities) about the long term frequency with which deadlines will be missed provides a means to reason about the behaviour of the system. From this perspective, QOS can be seen as being roughly equivalent to quantifiable, soft real­time. 2.5 Extending QOS The suitability of QOS as a paradigm for resource allocation for CM traffic in multi­ service networks has been identified, as has the relative inflexibility of HRT for this purpose. This will have implications for the software which is running on systems connected to these networks. 2.5.1 IMAC [Nicolaou91] presents an extensive survey of multi­media systems and applications, then uses this to guide the design of an Integrated Multi­media Applications Com­ munication (IMAC) architecture, the aim of the architectural approach being to describe the components within the system and how they interact. The IMAC architecture recognises the fact that, by their nature, most CM applications are distributed and has as one of its major goals the integration of CM data within a distributed computing environment. So the work presented focusses on system components for generating and presenting media (devices), transporting media (net­ works and protocols), and distributed processing (providing a base set of services which applications can use for distributed processing). The scope of the architec­ ture is quite broad, but of particular relevance to the work which is presented in this dissertation is the treatment of QOS within the architectural design (section 5.2). 20 Within the IMAC architecture, the communications system is viewed as a service provider and applications are viewed as service users. In addition to this, system servers export interfaces which describe the set of operations which can be performed on them. The underlying system makes a series of QOS offers representing the QOS it can support. Each operation within an interface has associated with it a QOS specification, which is used to select the set of QOS offers which are acceptable for use with that particular operation. The invoker of an operation specifies a QOS request, which identifies a single QOS from those selected by the QOS specification for use with the current operation. Throughout the system, QOS is specified in terms which are appropriate to the current level of abstraction, so at the application level for example, QOS appropri­ ate to a video stream might be described simply as compressed pal, while at the communications protocol level, it would more appropriately be described in terms of peak bandwidth, maximum acceptable delay and allowable error rate. To imple­ ment the conversion from one form of QOS to another, a number of constructs are introduced: a QOS domain delineates the scope of QOS offers, specifications and requests; QOS domains are implemented as a set of QOS layers, which represent the points at which QOS is provided; and QOS mapping is provided to convert QOS specifications from one layer to an underlying layer. Two algorithms are provided for the purposes of negotiating QOS: one takes a desired QOS and recursively evaluates a set of conforming QOS specifications; the other determines a suitable QOS request from this set. In IMAC, these algorithms operate on the layers of a communications protocol stack and the results which they return are a private instance of a protocol stack which will provide the QOS desired by the application. Two important points derive from the QOS paradigm presented in IMAC. The first is that QOS is always understood to extend between the end points of any operation; this is a simple consequence of end­to­end arguments in system design. The second is that, while the discussion of QOS within the IMAC architecture focusses on the services provided by a communications system, the algorithms used are quite generic and allow other system components to be represented as QOS layers for the purposes of end­to­end QOS negotiation. 2.5.2 QOS­A [Campbell93] presents an integrated Quality of Service Architecture (QOS­A) in which QOS is used as the single paradigm for resource allocation and scheduling throughout a system. The discussion focusses on the provision of QOS guarantees 21 to support CM communications and presents a number of requirements for a QOS­A. A number of points germane to the current discussion are raised. Within QOS­A, application QOS requirements need to be mapped through all sys­ tem layers to the network so that support for QOS can be provided at all layers in the form of end­to­end QOS negotiation, admission control, policing and mon­ itoring. To do this will require support from the operating system in the form of scheduling resources such as processor cycles and memory. While much of the current work concentrates on the use of QOS in networks and protocols to provide communications guarantees, a generalised QOS­A ought to be extensible to include other areas of QOS provision such as real­time control systems. A consequence of focussing on the use of QOS in networks is that QOS is being seen as a service provider issue, and little attention is being given to service user issues. This is illustrated by the observation that current notions of QOS provide little or no feedback to applications when QOS changes. 2.5.3 End­to­End QOS Both IMAC and QOS­A emphasise that QOS must be maintained between the end points of distributed CM applications. There is little point in having the network deliver CM data according to some QOS requirement if the devices or applications at the end points are not given the resources to handle the data in a corresponding manner. This provides the motivation for extending QOS from the network into the operating system. To this end, the chapters following investigate the consequences of using within the operating system, resource allocation strategies similar to those used within the network. 2.6 Summary The use of HRT and QOS to schedule resources in a timely manner has been re­ viewed, and both have been compared with respect to their suitability for scheduling CM data. While HRT is capable of satisfying stringent timing requirements, it does not readily afford the flexibility required to take advantage of the unique properties of CM. While SRT tolerates missed deadlines, little attention is paid to quantify­ ing the frequency with which deadlines are missed. The use of QOS for scheduling CM within networks has shown that it proves flexible enough to accommodate a wide range of scheduling demands and so is more useful for managing CM. End­to­ 22 end QOS requirements in distributed applications dictate that the operating system be able to provide QOS guarantees at all of the layers used by the applications. This includes scheduling of the system resources used within the communications protocol stacks as well as those needed by applications. The remainder of this dis­ sertation will consider the use of QOS as a means of scheduling applications within a workstation operating system. 23