The second approach we will consider is that of automatically generating a list of the entries in the directory. This used to be a popular approach to creating websites, especially those mainly consisting of files to download, but it's now falling out of favour.
The relevant module is rather old and clunky, harking back to the days when browsers didn't support tables in HTML, but it is in very widespread use so we need to consider it. We will start by loading the module and removing dir_module (for simplicity at this stage).
This is also the largest module (in terms of number of commands) we will cover in this course. If you can cope with this one, you can cope with any of them.
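As a rough sketch, the change to the configuration file might look like the following. The module file paths are assumptions and vary between installations; check where your distribution keeps its modules.

```apache
# Load the auto-indexing module.
# (The path is an assumption; adjust it to match your system.)
LoadModule autoindex_module /usr/lib/apache2/mod_autoindex.so

# dir_module is removed for now. Here it is commented out rather than deleted.
#LoadModule dir_module /usr/lib/apache2/mod_dir.so
```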
If we just load the module then we see that, instead of getting a 404 "Not found" error, we get a 403 "Forbidden" error.
This can be confusing, because 'Forbidden' is more commonly associated with access control. In this case you are seeing it because the web server has been configured to handle directories but by default won't do so. As with symbolic links above (see Chapter 5) we need to set an option to instruct the module to do its job. Note that this use of Options follows the loading of the module. Several options we will meet rely on a specific module and their use must follow the LoadModule line in the configuration file.
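A minimal sketch of the option in question, assuming it is set at the top level of the configuration file. The module path is an assumption, and the "+" form is used here so that options set earlier (such as the symbolic link option) are added to rather than replaced.

```apache
# The module must be loaded first...
LoadModule autoindex_module /usr/lib/apache2/mod_autoindex.so

# ...and only then can we turn on directory listings.
Options +Indexes
```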
And now, if we ask for the / URL we get the list of the files and directories that appear in the top-level directory. We've included some additional files to make things more interesting.
By default, the index produced is a very simple one (an itemised list in HTML). The module provides a number of commands to make it a bit more interesting. To see it in operation we will add a simple IndexOptions command to our configuration file to turn on "fancy indexing".
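The addition is a single line, sketched here on the assumption that it goes alongside the other autoindex settings:

```apache
# Switch from the plain bullet list to the multi-column "fancy" listing.
IndexOptions FancyIndexing
```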
Next we will suppress certain rows from the listing. Why would we want to do this? Well, suppose the web developers edit their files in place (i.e. in the directory managed by the web server) with an editor (emacs, say). While it is editing a file (alpha.html, say) the editor creates a work file (#alpha.html#), and when it has finished it leaves behind a backup file (alpha.html~). We don't want these files appearing in the listings. We do this with the IndexIgnore command.
Note that the expressions to be ignored are placed in quotes. This is not typically necessary but under certain circumstances it is required. In this case the "#" character is the comment character in httpd.conf files. If it was not enclosed in quotes then everything on the IndexIgnore line beyond the first "#" would be ignored.
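For the emacs case described above, the line might look something like this. The two patterns are a sketch derived from the example file names, not a complete list of editor droppings.

```apache
# Hide emacs work files (#alpha.html#) and backup files (alpha.html~).
# Both patterns are quoted, following the advice above.
IndexIgnore "#*#" "*~"
```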
Warning: Just because a file name is not in the listing does not mean that it cannot be downloaded. If I see alpha.html and guess that there might be an alpha.html~ I can still request it and the server will serve it to me. We will deal with blocking these downloads in Section 10.7.
In addition to having a listing of files, it is possible to place text above and below the listing. This can either be in the form of plain text or full-blown HTML. We will concentrate on the latter.
To add HTML above the listing the configuration must identify a header file. This file must have a name that identifies it as having MIME content type text/html. In the simple case, however, the file's content should not be a full HTML document but just the HTML body component (without the leading BODY tag) for the text to appear above the listing. Everything else will be automatically generated. We identify this file (should it exist) with the HeaderName command.
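A minimal sketch of the directive, using the file name HEADER.html that the rest of this section assumes:

```apache
# If a file called HEADER.html exists in a directory,
# insert its contents above that directory's listing.
HeaderName HEADER.html
```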
A suitable HEADER.html file might look like this:
<p>Here is an HTML fragment.</p>
<p>It will automatically appear above the auto-generated file listing.</p>
Note that the HEADER.html file appears in the listing too. Typically this is not wanted as it is already "doing its job" by having its contents appear at the top of the page. The file HEADER.html would be a good candidate for the IndexIgnore command.
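One way to do that, sketched here as a single IndexIgnore line that also carries the emacs patterns from earlier:

```apache
# Hide editor droppings and the header file itself from the listing.
IndexIgnore "#*#" "*~" "HEADER.html"
```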
The next step in prettying up the listing is to add icons. Typically, icons are used to represent the MIME content type of the file. We will use the icons in the /usr/share/apache2/icons/ directory which are provided for this purpose.
We are immediately presented with a problem. The icons directory is not in either web site's DocumentRoot. We could copy the directory or symlink to it, but in this case we are going to introduce another facility: aliasing. This comes courtesy of the alias_module module and its Alias command.
The Alias command overrides the DocumentRoot for specific URLs. In this case any URL whose local part starts with /icons/ (n.b. the trailing slash) will be looked up in /usr/share/apache2/icons/. If we place this directive before the definitions of the virtual hosts then it will apply to both.
Once the module has been loaded, the Alias command may be run multiple times, both inside and outside of the virtual host sections. If it appears within a virtual host's paragraph then it applies to just that virtual host.
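A sketch of the two lines involved, assuming they are placed before the virtual host definitions so that they apply to both sites. The module path is an assumption; the icon directory is the one named above.

```apache
# Load the aliasing module (path is an assumption; adjust to your system).
LoadModule alias_module /usr/lib/apache2/mod_alias.so

# Serve any URL starting /icons/ from the system icon directory
# rather than from the DocumentRoot. Note the trailing slashes.
Alias /icons/ /usr/share/apache2/icons/
```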
The file icon.sheet.png in the icons directory gives a quick lookup of all the icons provided. Now that we have access to the icons, we need to know how to make use of them in directory listings. The auto-indexing module provides a slew of commands for this purpose. The trick to producing self-consistent indexes is to use as few as possible. We will set up distinct icons for the following entries.
Categories with distinct icons
HTML web pages
Plain text pages
Any other "text" format
Any image format
Any audio format
Any movie format
PostScript
Portable Document Format (PDF)
Any other file content type
Subdirectories
The parent directory
The command that associates an icon with a MIME content type is AddIconByType. However, we will also specify the ALT text for text-based browsers with the analogous AddAltByType command. While we are at it, we will supply a DefaultIcon to use when nothing else matches.
AddIconByType /icons/layout.gif text/html
AddAltByType "HTML file" text/html
AddIconByType /icons/text.gif text/plain
AddAltByType "Plain text" text/plain
AddIconByType /icons/generic.gif text/*
AddAltByType "Text" text/*
AddIconByType /icons/image2.gif image/*
AddAltByType "Static image" image/*
AddIconByType /icons/sound1.gif audio/*
AddAltByType "Audio" audio/*
AddIconByType /icons/movie.gif video/*
AddAltByType "Video" video/*
AddIconByType /icons/ps.gif application/postscript
AddAltByType "PostScript" application/postscript
AddIconByType /icons/pdf.gif application/pdf
AddAltByType "PDF" application/pdf
DefaultIcon /icons/ball.gray.gif
Note: The icons are supplied in GIF and PNG format. Normally I would recommend using the PNG icons rather than the GIF ones since PNG is technically a better format and not troubled by patent problems. However, whoever converted the GIFs to PNGs got the background transparency wrong so you should use the GIF icons for the time being until the PNGs are fixed.
We still have a problem with directories. There is no MIME content type for a directory so we must use other facilities. The following is a filthy hack introduced by Apache version 1 and preserved into version 2.
AddIcon /icons/dir.gif "^^DIRECTORY^^"
AddAlt "Directory" "^^DIRECTORY^^"
AddIcon /icons/back.gif ".."
AddAlt "Up" ".."
Conclusion: our listings now look a bit more colourful, but this is a lot of effort for limited presentational value.
We commented above that the data presented in the file listings is inherently tabular and would be better presented as an HTML table. This is now available as an "experimental feature" in versions of Apache beyond 2.0.23 (SLES 10 ships with 2.2.3). The authors can find no mechanism for setting the attributes of the table from within Apache except to use stylesheets in the header file for the directory.
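The table layout is requested through IndexOptions. A minimal sketch, assuming fancy indexing is wanted as before:

```apache
# HTMLTable only has an effect when FancyIndexing is also on.
IndexOptions FancyIndexing HTMLTable
```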
In the summary of the commands provided by autoindex_module given below only the commands and options discussed in this course are covered. There are many more. If you can't get the result you want with the commands given to date then consult the full Apache documentation. You might get lucky.
Note: Commands and options that only make sense if fancy indexing is turned on are marked with an "(f)".
Syntax summary: autoindex_module
IndexOptions
Sets various parameters for how the index should look. The list below gives the options.
IndexIgnore
Takes a list of file names or shell-style wildcard patterns. Files whose names match one or more of the patterns are not listed in the index.
AddIconByType
Specifies the icon that should be used for a particular MIME content type. The MIME content type can either be fully specified (e.g. text/html) or partially specified (e.g. text/*).
AddAltByType
Specifies the ALT attribute in the <IMG/> tag. If you are expecting text-only browsers you might want to keep this short and of constant width (three characters is traditional). Alternatively, ditch the icons altogether.
DefaultIcon
This specifies the icon to be used if nothing else matches. There does not appear to be an equivalent DefaultAlt command.
AddIcon
This specifies an icon for a particular file name. Typically this should be avoided but it is the best way to match the parent directory .. and other directories with the pseudo-filename ^^DIRECTORY^^.
AddAlt
This specifies ALT text alongside AddIcon's images.
HeaderName
This identifies the file whose contents should be placed above the file listing. The first file in the list that exists is used. These file names typically appear in the IndexIgnore instruction. The files can be either plain text, an HTML body fragment or an entire "top half" of an HTML page. To stop the server adding its own HTML top half see the IndexOptions option SuppressHTMLPreamble.
ReadmeName
Exactly as HeaderName but it corresponds to the text below the listing. This can only be plain text or an HTML body fragment.
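In Apache's autoindex module the footer counterpart to HeaderName is the ReadmeName directive. A minimal sketch of the pair in use follows; the footer file name README.html is an assumption chosen for illustration, not something used elsewhere in this course.

```apache
# Text above the listing.
HeaderName HEADER.html

# Text below the listing (file name is a hypothetical example).
ReadmeName README.html

# Keep both helper files out of the listing itself.
IndexIgnore "HEADER.html" "README.html"
```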
Syntax summary: Options to IndexOptions
FancyIndexing
Turns on the four-column (by default) indexing mode rather than plain, bullet-list indexing mode.
HTMLTable
(f) This instructs Apache to use an HTML table rather than a <PRE> block to present the listing.