WWW servers generally reside on machines with a file system. The server's job is to make part of that file system publicly available by responding to HTTP requests. Its job is also to prevent the private parts of that file system from becoming public.
Most file systems can be thought of as a form of tree, and the URLs used in the World Wide Web also use this model. Thus the URL:
http://www.cs.ucl.ac.uk/misc/uk/london.html
specifies the file called london.html, which is in a directory called uk, which in turn is in a directory called misc. The misc directory resides in the top-level directory of the tree, which is sometimes simply called ``/'' (pronounced ``slash'').
Figure 5.1: The structure of a Uniform Resource Locator
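As a minimal sketch of the structure just described, the standard Python library can split the example URL into its parts and then walk the path from the top-level directory down to the file. The variable names here are our own and are only for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

url = "http://www.cs.ucl.ac.uk/misc/uk/london.html"
parts = urlsplit(url)

print(parts.scheme)   # the protocol used to fetch the document: http
print(parts.netloc)   # the machine the server runs on: www.cs.ucl.ac.uk
print(parts.path)     # the path within the server's tree: /misc/uk/london.html

# Treat the path as a route through a tree, starting at "/".
path = PurePosixPath(parts.path)
print(path.parts)     # ('/', 'misc', 'uk', 'london.html')
```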
The slashes (``/'') separating the directory names are the Unix way of specifying a file name. On DOS and Windows systems, users are more used to backslashes (``\''). Many Apple Mac users aren't familiar with this concept at all, although their folders do actually perform the same task. Although users don't often see it, Apple scripts use the colon (``:'') where Unix uses a slash. However, when you're writing URLs, whatever system you're on, you must use slashes. If you're more used to folders than directories, simply substitute the word ``folder'' wherever we say ``directory''!
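The following sketch shows one way of turning a relative file name written with DOS or Mac separators into the slash-separated form a URL needs. The function names dos_to_url_path and mac_to_url_path are hypothetical helpers invented for this illustration, and the sketch assumes simple relative names with no drive letters or volume names.

```python
from pathlib import PureWindowsPath


def dos_to_url_path(native: str) -> str:
    """Rewrite a backslash-separated relative name using slashes."""
    return PureWindowsPath(native).as_posix()


def mac_to_url_path(native: str) -> str:
    """Rewrite a colon-separated relative name using slashes."""
    return native.replace(":", "/")


print(dos_to_url_path(r"misc\uk\london.html"))  # misc/uk/london.html
print(mac_to_url_path("misc:uk:london.html"))   # misc/uk/london.html
```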
When the URL above specifies /misc/uk/london.html, this does not usually mean that the misc directory is really situated in the root directory of the entire file system. Instead it is situated in the root directory of the subtree that the WWW server makes public. Any document situated in this subtree can be served, and files and directories outside the subtree cannot (see figure 5.2).
Figure 5.2: A WWW server makes a subtree of the filesystem public
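The mapping from a URL path onto the public subtree can be sketched as follows. This is not how any particular server does it; the directory /usr/local/www/public and the function name resolve are assumptions made up for the example. The point is that the path is joined onto the public root, and any request that would climb out of the subtree (for instance by using ``..'') is refused.

```python
import posixpath
from typing import Optional

# Hypothetical root of the public subtree (no trailing slash).
DOCUMENT_ROOT = "/usr/local/www/public"


def resolve(url_path: str, root: str = DOCUMENT_ROOT) -> Optional[str]:
    """Map a URL path onto a file name below root, or None if it escapes."""
    # Join onto the root, then collapse "." and ".." segments textually.
    candidate = posixpath.normpath(posixpath.join(root, url_path.lstrip("/")))
    # Only accept results that are still inside the public subtree.
    if candidate == root or candidate.startswith(root + "/"):
        return candidate
    return None


print(resolve("/misc/uk/london.html"))  # /usr/local/www/public/misc/uk/london.html
print(resolve("/../../etc/passwd"))     # None
```

Note that this check is purely textual: it does not look at the real file system, so it says nothing about symbolic links.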
Now, with some servers, this is the whole story. However, most servers also allow you to provide some form of access control to files and subdirectories of the visible subtree. This protection can take the form of restrictions on which machines or networks a client can access a file from, or it may take the form of password protection. Which mechanisms a server provides depends on which server you choose, and we'll discuss a few of the better servers later in this chapter.
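To make the idea of restricting access by machine or network concrete, here is a small sketch. The rule table, the /internal/ prefix and the function name is_allowed are all invented for the illustration, and the addresses come from ranges reserved for documentation; real servers express such rules in their own configuration files. Password protection is not shown.

```python
import ipaddress

# Hypothetical rules: URL path prefix -> networks allowed to fetch it.
ACCESS_RULES = {
    "/internal/": [ipaddress.ip_network("192.0.2.0/24")],
}


def is_allowed(url_path: str, client_ip: str) -> bool:
    """Return True if the client may fetch url_path under ACCESS_RULES."""
    client = ipaddress.ip_address(client_ip)
    for prefix, networks in ACCESS_RULES.items():
        if url_path.startswith(prefix):
            # A protected subtree: the client must be on a listed network.
            return any(client in network for network in networks)
    # No rule matches, so the document is public.
    return True


print(is_allowed("/internal/report.html", "192.0.2.17"))    # True
print(is_allowed("/internal/report.html", "198.51.100.5"))  # False
print(is_allowed("/misc/uk/london.html", "198.51.100.5"))   # True
```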
Another issue is raised where a server is running on a machine in a large multi-user environment such as a university. For instance, each student in a university can write files to their own filestore, but not anywhere else. However, we'd like our students to be able to create their own WWW pages, despite not having access to the WWW server's default public tree. WWW server designers have foreseen this need, and Unix servers usually make available files placed in a special directory in the user's home directory. On NCSA and CERN servers, this directory is called ``public_html'' by default. Thus an access to the URL
http://www.euphoric-state-uni.edu/~janet/research/index.html
would map onto the file:
/usr/home/janet/public_html/research/index.html
in the filesystem shown in figure 5.3.
Figure 5.3: A WWW server exporting a user's public_html pages
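The mapping in this example can be sketched in a few lines. We assume, as in the figure, that every home directory lives directly under /usr/home; a real server would normally look up the user's home directory rather than build it from a fixed prefix. The function name map_user_url is hypothetical.

```python
import posixpath
from typing import Optional

HOME_ROOT = "/usr/home"   # where home directories live in the example
USER_DIR = "public_html"  # default name on NCSA and CERN servers


def map_user_url(url_path: str) -> Optional[str]:
    """Map a path such as /~janet/x.html onto the user's public_html tree."""
    if not url_path.startswith("/~"):
        return None  # not a per-user URL
    user, _, rest = url_path[2:].partition("/")
    if user in ("", ".", ".."):
        return None  # no sensible user name given
    base = posixpath.join(HOME_ROOT, user, USER_DIR)
    candidate = posixpath.normpath(posixpath.join(base, rest))
    # Refuse anything that climbs out of the user's public_html directory.
    if candidate == base or candidate.startswith(base + "/"):
        return candidate
    return None


print(map_user_url("/~janet/research/index.html"))
# /usr/home/janet/public_html/research/index.html
```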
Once we start to allow the WWW server access to areas of our filesystem which can be modified by users that we don't necessarily trust, a whole set of security issues is raised. For instance, Unix allows symbolic links from one place in the directory tree to another to give the impression that files or directories are someplace else (Macs call symbolic links ``Aliases''). Letting the server follow links can be useful, but it can also create problems. Just because a file is readable by other users on your own system does not necessarily mean it should be readable by users at other sites or in other countries!
Figure 5.4: Problems with a WWW server following symbolic links
In figure 5.4, we see that Janet has made a symbolic link from inside her ``public_html'' subtree to John's ``new_project'' directory, making it accessible to the whole world without John's knowledge. Most servers allow different security options to be specified on a per-subtree basis, and in this case, if following symbolic links had been switched off for public_html directories, the problem would have been avoided. MacHTTP simply prohibits the following of links, which is another way to solve the problem.
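One way a server could detect the situation in figure 5.4 is to resolve every symbolic link in a file name and check that the result still lies inside the subtree being served. The sketch below is only an illustration of that idea, not a description of how any particular server implements it. The names is_inside_tree and demo are our own, and the demonstration assumes a Unix-like system on which symbolic links can be created.

```python
import os
import tempfile


def is_inside_tree(path: str, tree_root: str) -> bool:
    """True if path, with all symbolic links resolved, lies under tree_root."""
    real_root = os.path.realpath(tree_root)
    real_path = os.path.realpath(path)
    return os.path.commonpath([real_root, real_path]) == real_root


def demo():
    """Rebuild the figure 5.4 layout in a scratch directory and test it."""
    with tempfile.TemporaryDirectory() as tmp:
        public = os.path.join(tmp, "janet", "public_html")
        private = os.path.join(tmp, "john", "new_project")
        os.makedirs(public)
        os.makedirs(private)
        page = os.path.join(public, "index.html")
        open(page, "w").close()
        # Janet's link from her public tree into John's directory.
        link = os.path.join(public, "project")
        os.symlink(private, link)
        return is_inside_tree(page, public), is_inside_tree(link, public)


print(demo())  # (True, False)
```

An ordinary page passes the check, while the link into John's directory is rejected because its real location is outside Janet's public_html subtree.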