In the words of the authors,
The Real-Time Streaming Protocol (RTSP) establishes and controls either a single or several time-synchronized streams of continuous media such as audio or video... In other words, RTSP acts as a ``network remote control'' for multimedia servers.
As we have already discussed above, the interactions of a client with a server using a network remote control are best modelled by the use of an RPC mechanism. However, rather than using an RPC mechanism directly, the designers of RTSP decided to use a variation on HTTP, since HTTP, in its current incarnation (version 1.1), approximates an application-level RPC mechanism. The designers felt that they could leverage the work already done on HTTP to produce production code for RTSP clients and servers and, in particular, reuse the technology developed for proxies and security.
Multimedia presentations are identified by URLs, using a protocol scheme of ``rtsp''. The hostname is the server containing the presentation, whilst the port indicates which port the RTSP control requests should be sent to. Presentations may consist of one or more separate streams. The presentation URL provides a means of identifying and controlling the whole presentation rather than coordinating the control of each individual stream. So,
rtsp://media.example.com:554/twister/audiotrack
identifies the audio stream within the presentation twister, which can be controlled on its own. If the user would rather stop and start the whole presentation, including the video, then they would use the URL:
rtsp://media.example.com:554/twister/
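The structure of these URLs can be illustrated with a minimal Python sketch that splits an ``rtsp'' URL into the server, the control port and the path naming the presentation or stream. The function name parse_rtsp_url is hypothetical, and the fallback to port 554 when the URL omits a port is an assumption of this sketch.

```python
from urllib.parse import urlsplit

DEFAULT_RTSP_PORT = 554  # assumed fallback when the URL carries no port


def parse_rtsp_url(url):
    """Split an rtsp:// URL into (host, port, path)."""
    parts = urlsplit(url)
    if parts.scheme != "rtsp":
        raise ValueError("not an RTSP URL: " + url)
    port = parts.port if parts.port is not None else DEFAULT_RTSP_PORT
    return parts.hostname, port, parts.path


# One component stream, and the presentation as a whole.
stream = parse_rtsp_url("rtsp://media.example.com:554/twister/audiotrack")
whole = parse_rtsp_url("rtsp://media.example.com:554/twister/")
print(stream)
print(whole)
```

Both URLs name the same server and control port; only the path distinguishes the single audio stream from the whole presentation.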
RTSP adds a number of new requests to the existing HTTP requests. These are
The media streams are left unspecified by RTSP. These could be RTP streams, or any other form of media transmission. RTSP only specifies the control, and it is up to the client and server software to maintain the mapping between the control channel and the media streams.
A key concept in RTSP is the notion of a session. The client first requests that a presentation be started by a server, and receives in return a session identifier, which it then uses in all subsequent control requests. Eventually, the client can request the teardown of the session, which releases the associated resources. The session identifier represents the shared state between the client and server. If the state is lost, for example through one of the machines being rebooted, then the protocol relies on the transport of the media stopping automatically, e.g. through not receiving RTCP messages if using RTP, or through the implementation using the GET_PARAMETER method as a keep-alive.
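As an illustration of how a client might pick up the session identifier, the following Python sketch parses the value of a Session response header. It assumes the header form given in the RTSP specification, an identifier optionally followed by a ``timeout'' parameter in seconds; the function name and the sample identifier are hypothetical.

```python
def parse_session_header(value):
    """Return (session_id, timeout_seconds or None) from a Session header value."""
    session_id, _, params = value.partition(";")
    timeout = None
    for param in params.split(";"):
        name, _, val = param.strip().partition("=")
        if name.lower() == "timeout" and val.isdigit():
            timeout = int(val)
    return session_id.strip(), timeout


# Hypothetical header values, as they might appear in a SETUP response.
with_timeout = parse_session_header("12345678;timeout=60")
without_timeout = parse_session_header("12345678")
print(with_timeout, without_timeout)
```

A client using a keep-alive would need to send some request carrying this identifier more often than the advertised timeout.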
The control requests and responses may be sent over either TCP or UDP. Since the order of the requests matters, the requests are sequenced, so if any requests are lost, they must be retransmitted. Using UDP thus requires the construction of retransmission mechanisms, so there are very few occasions when the application can get away with using UDP.
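To give a feel for the extra machinery UDP demands, here is a minimal Python sketch of the bookkeeping a client would need: it remembers each unanswered request by sequence number and reports those that are overdue for resending. The class name, the 0.5 second timeout and the abbreviated request strings are all assumptions made for illustration, not part of the protocol.

```python
class PendingRequests:
    """Track unanswered requests by sequence number so lost ones can be resent."""

    def __init__(self, timeout=0.5):
        self.timeout = timeout  # seconds to wait before resending (assumed value)
        self.pending = {}       # sequence number -> (request text, time last sent)

    def sent(self, cseq, request, now):
        self.pending[cseq] = (request, now)

    def acknowledged(self, cseq):
        # A response carrying the same sequence number completes the request.
        self.pending.pop(cseq, None)

    def due_for_retransmission(self, now):
        # Overdue requests are returned in sequence order, preserving ordering.
        return [(cseq, request)
                for cseq, (request, last_sent) in sorted(self.pending.items())
                if now - last_sent >= self.timeout]


queue = PendingRequests(timeout=0.5)
queue.sent(1, "OPTIONS ...", now=0.0)
queue.sent(2, "DESCRIBE ...", now=0.1)
queue.acknowledged(1)  # the response to request 1 arrived
overdue = queue.due_for_retransmission(now=1.0)
print(overdue)
```

Over TCP none of this is needed, since the transport already delivers the requests reliably and in order.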
The most obvious additions to the request header fields are a CSeq field to contain the sequence numbers of requests generated by the client, and a Session field in both request and response headers to identify the session. Session identifiers are generated in response to a SETUP request, and must be used in all stateful methods. The Transport field allows the client and server to negotiate and set parameters for the sending of the media stream. In particular, it allows the client and server to set ports and multicast addresses for the RTP streams. There are a number of other header fields, such as the time range of the presentation upon which the method executes (Range), and various fields which interact with caches and other proxies.
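The way these header fields fit together can be sketched in Python by assembling the text of two requests. This is an illustrative sketch rather than a complete client: the helper build_request, the session identifier and the client port numbers are hypothetical, and the Transport value follows the style of the examples in the RTSP specification.

```python
def build_request(method, url, cseq, session=None, extra_headers=None):
    """Assemble the text of an RTSP request with CSeq and optional Session."""
    lines = ["%s %s RTSP/1.0" % (method, url), "CSeq: %d" % cseq]
    if session is not None:
        lines.append("Session: " + session)
    for name, value in (extra_headers or {}).items():
        lines.append("%s: %s" % (name, value))
    # Header lines end with CRLF; an empty line terminates the header block.
    return "\r\n".join(lines) + "\r\n\r\n"


# SETUP carries no session yet; it proposes transport parameters instead.
setup = build_request(
    "SETUP", "rtsp://media.example.com:554/twister/audiotrack", 2,
    extra_headers={"Transport": "RTP/AVP;unicast;client_port=4588-4589"})

# Later stateful requests quote the identifier the server returned.
play = build_request(
    "PLAY", "rtsp://media.example.com:554/twister/", 3,
    session="12345678", extra_headers={"Range": "npt=0-"})

print(setup)
print(play)
```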
Descriptions of sessions use the Session Description Protocol
(described in Chapter ), which provides a generic
technique for describing the details of the presentation, such as
the transport and media types of the streams, and the presentation content.
Importantly, it also provides the start and end times of the
presentation, so that the client can PLAY from and to any point in the
presentation they wish.
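To make this concrete, the Python sketch below pulls the presentation name, the playable range and the component streams out of a small session description. The sample description is invented for illustration, the function name is hypothetical, and the ``a=range'' attribute is assumed to follow the form the RTSP specification suggests for use with SDP.

```python
SAMPLE_SDP = """\
v=0
o=- 2890844526 2890842807 IN IP4 192.0.2.10
s=twister
t=0 0
a=range:npt=0-7741
m=audio 0 RTP/AVP 0
a=control:audiotrack
m=video 0 RTP/AVP 31
a=control:videotrack
"""


def summarise_sdp(text):
    """Collect the session name, playable range and media streams."""
    summary = {"name": None, "range": None, "media": []}
    for line in text.splitlines():
        kind, _, value = line.partition("=")
        if kind == "s":
            summary["name"] = value
        elif kind == "m":
            # m=<media> <port> <transport> <format list>
            media, port, proto, *formats = value.split()
            summary["media"].append({"type": media, "port": int(port),
                                     "transport": proto, "formats": formats})
        elif kind == "a" and value.startswith("range:"):
            summary["range"] = value[len("range:"):]
    return summary


info = summarise_sdp(SAMPLE_SDP)
print(info["name"], info["range"], [m["type"] for m in info["media"]])
```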
Media streams are referenced through specification of their times, either relative to the start time of the presentation, or in real time. RTSP allows the use of the standard time codes used in industry, such as SMPTE or Normal Play Time, or the specification of an absolute time for presentations in real time.
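As a sketch of how two of these time formats look on the wire, the Python helpers below build Range header values for Normal Play Time (seconds from the start of the presentation) and for absolute UTC time. The function names are hypothetical; the ``npt='' and ``clock='' forms follow the RTSP specification, and the dates are example values only.

```python
from datetime import datetime, timezone


def _npt(seconds):
    # Render seconds without trailing zeros, e.g. 10 -> "10", 12.5 -> "12.5".
    return ("%.3f" % seconds).rstrip("0").rstrip(".")


def npt_range(start, end=None):
    """Range value in Normal Play Time; an open end means play to the finish."""
    return "npt=%s-%s" % (_npt(start), "" if end is None else _npt(end))


def clock_range(start, end):
    """Range value in absolute UTC time, for presentations in real time."""
    fmt = "%Y%m%dT%H%M%SZ"
    return "clock=%s-%s" % (start.astimezone(timezone.utc).strftime(fmt),
                            end.astimezone(timezone.utc).strftime(fmt))


relative = npt_range(10, 25)
open_ended = npt_range(12.5)
absolute = clock_range(datetime(1996, 11, 8, 14, 23, 0, tzinfo=timezone.utc),
                       datetime(1996, 11, 8, 14, 35, 20, tzinfo=timezone.utc))
print(relative, open_ended, absolute)
```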
To display a presentation, the client software first requires the RTSP URL of the presentation. Once it has this URL, it can display the presentation by following these steps.
RTSP is intended to be a generic protocol for the manipulation of continuous media over the Internet. For a given presentation or server, there may be limitations upon the controls the client can use. For instance, if a company has created a media presentation, it may well desire that a session is not recorded. Since the controls may sometimes be present and at other times not, this presents a problem in the design of a user interface. If a control is represented in the user interface, but isn't available for a particular presentation, then the user may attempt to use the control and be confused by the subsequent interactions.
Fortunately, the OPTIONS command allows the client to interrogate the server to determine the available methods, and the component streams. Once these have been determined, the client software can then build an appropriate interface, representing only the available controls, and presenting an indication of the components of the stream.
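A minimal Python sketch of this idea follows: it reads the list of methods from the Public header of an OPTIONS response and keeps only the interface controls the server supports. The sample response, the function names and the mapping from methods to button labels are all assumptions made for illustration.

```python
def available_methods(response):
    """Return the set of methods listed in the Public header of a 200 response."""
    head = response.split("\r\n\r\n", 1)[0]
    status_line, *header_lines = head.split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    if code != "200":
        return set()
    for line in header_lines:
        name, _, value = line.partition(":")
        if name.strip().lower() == "public":
            return {m.strip().upper() for m in value.split(",") if m.strip()}
    return set()


# Hypothetical mapping from RTSP methods to user interface controls.
CONTROLS = {"PLAY": "Play button", "PAUSE": "Pause button",
            "RECORD": "Record button"}


def controls_to_show(methods):
    return [label for method, label in CONTROLS.items() if method in methods]


sample = ("RTSP/1.0 200 OK\r\n"
          "CSeq: 1\r\n"
          "Public: DESCRIBE, SETUP, TEARDOWN, PLAY, PAUSE\r\n"
          "\r\n")
methods = available_methods(sample)
visible = controls_to_show(methods)
print(visible)
```

Here the server does not advertise RECORD, so the record control is simply left out of the interface rather than shown and then refused.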
Jon CROWCROFT
1998-12-03