Writing CGI scripts

CGI passes the information a script needs to the script in environment variables. The two most important are:

QUERY_STRING: The server puts the part of the URL after the first "?" in QUERY_STRING.
PATH_INFO: The server puts the part of the path name after the script name in PATH_INFO.

For instance, if we sent a request to the server with the URL:
http://www.cs.ucl.ac.uk/cgi-bin/htimage/usr/www/img/uk_map?404,451
and we had cgi-bin configured as a scripts directory, then the server would run the script called htimage. It would then pass the remaining path information "/usr/www/img/uk_map" to htimage in the PATH_INFO environment variable, and pass "404,451" in the QUERY_STRING variable. In this case, htimage is a script for implementing active maps supplied with the CERN HTTPD, and is described in more detail in section 8.5.3.
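
As a rough illustration of the split described above, the following Bourne shell sketch reports the two variables back to the client as plain text. The function name show_request is hypothetical and exists only for this example; the variable names are the CGI ones.

```shell
#!/bin/sh
# Hypothetical helper: emit a short MIME header, then the two variables.
show_request() {
  echo "Content-type: text/plain"
  echo
  echo "PATH_INFO=$PATH_INFO"
  echo "QUERY_STRING=$QUERY_STRING"
}

# For the URL in the text, the server would set these before running
# the script:
#   PATH_INFO=/usr/www/img/uk_map
#   QUERY_STRING=404,451
show_request
```

Run outside a server, both variables are normally unset, so the two lines are printed with empty values.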

The server expects the script program to produce some output on its standard output. It first expects to see a short MIME header, followed by a blank line, and then any other output the script wants returned to the client. The MIME header must have one or more of the following directives:

Content-Type: type/subtype: This specifies the form of any output that follows.
Location: URL: This specifies that the client should request the given URL rather than display the output. This is a redirection. Some servers may allow the URL to be a short URL specifying only the path and file name; in this case the server will usually return the relevant file directly to the client, rather than sending a redirection.
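
The two directives can be sketched in Bourne shell as follows. This is only an illustration: the function names, the "/moved" path and the target URL are hypothetical placeholders, not part of any server.

```shell
#!/bin/sh
# Hypothetical helpers: each prints a complete short MIME header,
# including the blank line that ends it.
send_html_header() {
  echo "Content-type: text/html"
  echo
}

send_redirect() {
  # $1 is the URL the client should request instead.
  echo "Location: $1"
  echo
}

# Redirect requests for the placeholder path "/moved"; otherwise
# return a small page.
if [ "$PATH_INFO" = "/moved" ]; then
  send_redirect "http://www.host/new/location.html"  # placeholder URL
else
  send_html_header
  echo "<TITLE>Example</TITLE>"
  echo "<H1>Example</H1>"
fi
```

Note that a redirection sends no body: the header and its terminating blank line are the whole of the script's output.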

The short MIME header can optionally contain a number of other MIME header fields. The server checks these as well, and adds any missing fields before passing the combined reply to the client.

Under some circumstances, the script may want to create the entire MIME header itself. For instance, you may want to do this if you want to specify expiry dates or status codes yourself, and don't need the server to parse your header and insert any missing fields. In this case, both the CERN and NCSA servers recognise scripts whose names begin with "nph-" as "no parse header" scripts, and will not modify the reply at all. Under these circumstances your script will need access to extra information to be able to fill out all the header fields correctly, and so this information is also available via CGI environment variables.
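
A minimal sketch of such a script follows, assuming it is installed under a name beginning "nph-" (for example nph-hello, a hypothetical name). The function nph_header is also hypothetical, and the fallback to HTTP/1.0 when SERVER_PROTOCOL is unset is an assumption made so that the sketch can be run by hand.

```shell
#!/bin/sh
# Hypothetical nph- script: writes the whole reply header itself.
nph_header() {
  # Status line, using the protocol the request arrived with.
  printf '%s 200 OK\r\n' "${SERVER_PROTOCOL:-HTTP/1.0}"
  # Identify the server only if it told us what it is.
  if [ -n "$SERVER_SOFTWARE" ]; then
    printf 'Server: %s\r\n' "$SERVER_SOFTWARE"
  fi
  printf 'Content-type: text/plain\r\n'
  # A blank line ends the header.
  printf '\r\n'
}

nph_header
echo "Hello from a no-parse-header script."
```

Because the server passes the reply through unmodified, the sketch ends each header line with a carriage return and line feed, as HTTP itself does.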

The full list of CGI environment variables is:

SERVER_SOFTWARE: This holds the name and version of the server that answered the request and is now running your script.
SERVER_NAME: The server's hostname. This is useful if you need to generate URLs referring to this server in your script.
GATEWAY_INTERFACE: The version of CGI that this server complies with. For example, "CGI/1.0".
SERVER_PROTOCOL: The name and version of the protocol this request arrived with (i.e. the protocol the client speaks). For example, "HTTP/1.0".
SERVER_PORT: The port on the server that the request was sent to. Again, this is useful if you need to generate URLs referring to this server in your script.
REQUEST_METHOD: The method of the request. For example, for HTTP, this might be GET, POST, HEAD, etc.
PATH_INFO: The extra path information as given by the client. For example, sending a GET request to a server using the URL
http://www.host/cgi-bin/htimage/usr/www/img/map1
may cause the script htimage to be run from cgi-bin with PATH_INFO set to "/usr/www/img/map1".
PATH_TRANSLATED: This contains the data given in PATH_INFO after the server has attempted to translate it into a real path on your filesystem. The result may or may not be meaningful!
SCRIPT_NAME: This is the virtual name and path of the script, as seen in a URL referencing it.
QUERY_STRING: This contains the information after the "?" in the URL which caused this script to be executed. The information is just as it came from the URL, without having been URL decoded at all. This is used to hold the coordinate information in active maps, the text query with ISINDEX, the entire encoded form with forms that use the GET method, and so on.
REMOTE_HOST: The host name of the machine the client is running on. If the server doesn't know, it should leave this unset and set REMOTE_ADDR instead.
REMOTE_ADDR: The IP address of the machine the client is running on.
AUTH_TYPE: If the server supports user authentication and the script is protected, this is the authentication method that was used to validate the user's identity.
REMOTE_USER: If the server supports user authentication and the script is protected, this is the username the user gave to the authentication process.
REMOTE_IDENT: Some servers and client hosts support RFC931 identification, whereby when the client connects to the server, the server queries the client's machine to find the username of the user who made the connection. This information is not always reliable, and will reduce the performance of the server, but may be useful for some logging purposes.
CONTENT_TYPE: Requests such as PUT and POST (which can be used to submit forms) carry attached information in the body of the request. This is the MIME content type of the body of such a request.
CONTENT_LENGTH: This is the length of the attached information sent with a PUT or POST request.
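
Two of these uses can be sketched in Bourne shell. Both function names are hypothetical. The first assumes the server is reached over HTTP and that 80 is its default port. The second relies on the CGI convention, not spelled out in the text above, that the attached request body arrives on the script's standard input.

```shell
#!/bin/sh
# Hypothetical helper: build a URL that refers back to this script.
# Assumes plain HTTP, with 80 as the default port.
self_url() {
  if [ "${SERVER_PORT:-80}" = "80" ]; then
    echo "http://$SERVER_NAME$SCRIPT_NAME"
  else
    echo "http://$SERVER_NAME:$SERVER_PORT$SCRIPT_NAME"
  fi
}

# Hypothetical helper: copy exactly CONTENT_LENGTH bytes of the request
# body from standard input to standard output.
read_body() {
  if [ -n "$CONTENT_LENGTH" ] && [ "$CONTENT_LENGTH" -gt 0 ]; then
    dd bs=1 count="$CONTENT_LENGTH" 2>/dev/null
  fi
}
```

Reading only CONTENT_LENGTH bytes matters because the server is not obliged to close the script's input once the body has been sent.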

A simple example of a CGI script written in Bourne shell for a Unix system is:

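  #!/bin/sh
  # Locate the finger program once, up front.
  FINGER=`which finger`

  # Short MIME header, then the blank line that ends it.
  echo "Content-type: text/html"
  echo

  if [ -z "$QUERY_STRING" ]; then
    # No query: return a page that asks for one.
    echo "<TITLE>Finger Gateway</TITLE>"
    echo "<H1>Finger Gateway</H1>"
    echo "<ISINDEX>"
    echo "This is a gateway to \"finger\". "
    echo "Type a user@host combination in your browser's search dialog.<P>"
  else
    # Query present: run finger on it. QUERY_STRING is quoted so that the
    # shell passes it as a single argument; it is still not URL decoded.
    echo "<PRE>"
    "$FINGER" "$QUERY_STRING"
    echo "</PRE>"
  fi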

This generates a page of HTML allowing the user to enter the username of the person to query, unless it's called with a username in QUERY_STRING, in which case it executes the Unix finger command using QUERY_STRING as a parameter, and then returns the result to the user.
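
Since QUERY_STRING comes straight from the client, a gateway like this may want to check it before handing it to another program. The sketch below shows one possible check. The function name is_safe_query and the set of permitted characters are assumptions made for this illustration, not part of the script above.

```shell
#!/bin/sh
# Hypothetical check: accept only non-empty queries made of letters,
# digits and the characters @ . _ - and reject any query that starts
# with "-", which a command could mistake for an option.
is_safe_query() {
  case "$1" in
    ""|-*) return 1 ;;
    *[!A-Za-z0-9@._-]*) return 1 ;;
  esac
  return 0
}

if is_safe_query "$QUERY_STRING"; then
  echo "query accepted"
else
  echo "query rejected"
fi
```

A gateway could call such a check in its else branch and return an error page instead of running finger when the query is rejected.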

Jon Crowcroft
Wed May 10 11:46:29 BST 1995