6.3. Automatic indexing of directories

The second approach we will consider is that of automatically generating a list of the entries in the directory. This used to be a popular approach to creating websites, especially those mainly consisting of files to download, but it's now falling out of favour.

6.3.1. Basic listings

The relevant module is rather old and clunky, dating back to the days when browsers didn't support tables in HTML, but it is in very widespread use so we need to consider it. We will start by loading the module and removing dir_module (for simplicity at this stage).

This is also the largest module (in terms of number of commands) we will cover in this course. If you can cope with this one, you can cope with any of them.


LoadModule	autoindex_module /usr/lib/apache2/mod_autoindex.so

If we just load the module then, instead of a 404 "Not found" error, we get a 403 "Forbidden" error.

This can be confusing, because 'Forbidden' is more commonly associated with access control. In this case you are seeing it because the web server has been configured to handle directories but by default won't do so. As with symbolic links above (see Chapter 5) we need to set an option to instruct the module to do its job. Note that this use of Options follows the loading of the module. Several options we will meet rely on a specific module and their use must follow the LoadModule line in the configuration file.


LoadModule	autoindex_module /usr/lib/apache2/mod_autoindex.so
Options	+Indexes

And now, if we ask for the / URL we get the list of the files and directories that appear in the top-level directory. We've included some additional files to make things more interesting.

6.3.2. Improving the listings

By default, the index produced is a very simple one (an itemised list in HTML). The module provides a number of commands to make it a bit more interesting. To see it in operation we will add a simple IndexOptions command to our configuration file to turn on "fancy indexing".


IndexOptions	FancyIndexing

Next we will suppress certain rows from the listing. Why would we want to do this? Suppose the web developers edit their files in place (i.e. in the directory managed by the web server) with an editor such as emacs. While a file (alpha.html, say) is being edited, the editor creates a work file (#alpha.html#), and when it has finished it leaves behind a backup file (alpha.html~). We don't want these files appearing in the listings. We hide them with the IndexIgnore command.

IndexIgnore	"#*#"  "*~" ".*"

Note that the expressions to be ignored are placed in quotes. This is not typically necessary but under certain circumstances it is required. In this case the "#" character is the comment character in httpd.conf files. If it was not enclosed in quotes then everything on the IndexIgnore line beyond the first "#" would be ignored.

Warning

Just because a file name is not in the listing does not mean that it cannot be downloaded. If I see alpha.html and guess that there might be an alpha.html~ I can still request it and the server will serve it to me. We will deal with blocking these downloads in Section 10.7.

In addition to having a listing of files, it is possible to place text above and below the listing. This can either be in the form of plain text or full-blown HTML. We will concentrate on the latter.

To add HTML above the listing the configuration must identify a header file. This file must have a name that identifies it as having MIME content type text/html. In the simple case, however, the file's content should not be a full HTML document but just the HTML body component (without the leading BODY tag) for the text to appear above the listing. Everything else will be automatically generated. We identify this file (should it exist) with the HeaderName command.

HeaderName	HEADER.html

A suitable HEADER.html file might look like this:

<p>Here is an HTML fragment.</p>

<p>It will automatically appear above the auto-generated file listing.</p>

Note that the HEADER.html file appears in the listing too. Typically this is not wanted as it is already "doing its job" by having its contents appear at the top of the page. The file HEADER.html would be a good candidate for the IndexIgnore command.
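One way to do this, as a sketch that simply extends the earlier IndexIgnore line with the header file's name:

IndexIgnore	"#*#"  "*~" ".*" HEADER.html

The earlier warning still applies: hiding HEADER.html from the listing does not stop anyone requesting it directly.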

The next improvement is to add icons to the listing. Typically, icons are used to represent the MIME content type of the file. We will use the icons in the /usr/share/apache2/icons/ directory which are provided for this purpose.

We are immediately presented with a problem. The icons directory is not in either web site's DocumentRoot. We could copy the directory or symlink to it, but in this case we are going to introduce another facility: aliasing. This comes courtesy of the alias_module module and its Alias command.


LoadModule	alias_module	/usr/lib/apache2/mod_alias.so
Alias		/icons/		/usr/share/apache2/icons/

The Alias command overrides the DocumentRoot for specific URLs. In this case any URL whose local part starts with /icons/ (n.b. the trailing slash) will be looked up in /usr/share/apache2/icons/. If we place this directive before the definitions of the virtual hosts then it will apply to both.

Once the module has been loaded, the Alias command may be used multiple times, both inside and outside of the virtual host sections. If it appears within a virtual host's paragraph then it applies to just that virtual host.
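As a minimal sketch of the per-host form, the fragment below places an Alias inside a single virtual host. The host name, the DocumentRoot and the /downloads/ mapping are all hypothetical, invented purely for illustration. The form of the VirtualHost address is also an assumption and should match whatever your existing virtual hosts use.

<VirtualHost *:80>
# Hypothetical host name and document root, for illustration only.
ServerName	www.example.org
DocumentRoot	/srv/www/example
# This alias is visible only to requests handled by this virtual host.
Alias		/downloads/	/srv/downloads/
</VirtualHost>

A request for /downloads/file.txt on this host would then be looked up as /srv/downloads/file.txt, while other virtual hosts would know nothing about the /downloads/ mapping.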

The file icon.sheet.png in the icons directory gives a quick lookup of all the icons provided. Now we have access to the icons we need to know how to make use of them in directory listings. The auto-indexing module provides a slew of commands for this purpose. The trick to producing self-consistent indexes is to use as few as possible. We will set up distinct icons for the following entries.

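Categories with distinct icons: HTML files, plain text files, other text files, images, audio, video, PostScript, PDF, directories and the parent directory.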

The command that associates an icon with a MIME content type is AddIconByType. However, we will also specify the ALT text for text-based browsers with the analogous AddAltByType command. While we are at it, we will supply a DefaultIcon to use when nothing else matches.


AddIconByType   /icons/layout.gif       text/html
AddAltByType    "HTML file"             text/html
AddIconByType   /icons/text.gif         text/plain
AddAltByType    "Plain text"            text/plain
AddIconByType   /icons/generic.gif      text/*
AddAltByType    "Text"                  text/*
AddIconByType   /icons/image2.gif       image/*
AddAltByType    "Static image"          image/*
AddIconByType   /icons/sound1.gif       audio/*
AddAltByType    "Audio"                 audio/*
AddIconByType   /icons/movie.gif        video/*
AddAltByType    "Video"                 video/*
AddIconByType   /icons/ps.gif           application/postscript
AddAltByType    "PostScript"            application/postscript
AddIconByType   /icons/pdf.gif          application/pdf
AddAltByType    "PDF"                   application/pdf

DefaultIcon     /icons/ball.gray.gif

Note: The icons are supplied in GIF and PNG format. Normally I would recommend using the PNG icons rather than the GIF ones since PNG is technically a better format and not troubled by patent problems. However, whoever converted the GIFs to PNGs got the background transparency wrong so you should use the GIF icons for the time being until the PNGs are fixed.

We still have a problem with directories. There is no MIME content type for a directory so we must use other facilities. The following is a filthy hack introduced by Apache version 1 and preserved into version 2.


AddIcon         /icons/dir.gif          "^^DIRECTORY^^"
AddAlt          "Directory"             "^^DIRECTORY^^"
AddIcon         /icons/back.gif         ".."
AddAlt          "Up"                    ".."

And now our listings look a bit more colourful. But this is a lot of effort for limited presentational value.

6.3.3. Using an HTML table

We commented above that the data presented in the file listings is inherently tabular and would be better presented as an HTML table. This is available as an "experimental feature" in Apache 2.0.23 and later (SLES 10 ships with 2.2.3). The authors can find no mechanism for setting the attributes of the table from within Apache other than using stylesheets in the header file for the directory.

IndexOptions HTMLTable
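A sketch of how the stylesheet approach might look, assuming the HEADER.html arrangement described earlier. HTMLTable only has an effect when fancy indexing is on, so both options are given together. The SuppressHTMLPreamble option tells the server not to generate its own opening HTML, which lets the header file carry a stylesheet in its own HEAD section. The title and the style rules are purely illustrative.

IndexOptions	FancyIndexing HTMLTable SuppressHTMLPreamble
HeaderName	HEADER.html

With the preamble suppressed, HEADER.html has to supply the whole top half of the page itself, up to and including the opening BODY tag:

<html>
<head>
<title>Index of files</title>
<style type="text/css">
table { border-collapse: collapse; }
td, th { padding: 0.2em 1em; text-align: left; }
</style>
</head>
<body>
<h1>Index of files</h1>

The server then appends the table and closes the page after it.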

6.3.4. Summary of the auto-indexing module

The summary of autoindex_module given below covers only the commands and options discussed in this course. There are many more. If you can't get the result you want with the commands given to date then consult the full Apache documentation. You might get lucky.

Note: Commands and options that only make sense if fancy indexing is turned on are marked with an "(f)".

Syntax summary: autoindex_module

IndexOptions

Sets various parameters for how the index should look. The list below gives the options.

IndexIgnore "name" "name" ...

Takes a list of file names or shell-style wildcard patterns. Files whose names match one or more of the patterns are not listed in the index.

AddIconByType icon mime_type (f)

Specifies the icon that should be used for a particular MIME content type. The MIME content type can either be fully specified (e.g. text/html) or partially specified (e.g. text/*).

AddAltByType "text" mime_type (f)

Specifies the ALT attribute in the <IMG/> tag. If you are expecting text-only browsers you might want to keep this short and of constant width (three characters is traditional). Alternatively, ditch the icons altogether.

DefaultIcon icon (f)

This specifies the icon to be used if nothing else matches. There does not appear to be an equivalent DefaultAlt command.

AddIcon icon name (f)

This specifies an icon for a particular file name. Typically this should be avoided but it is the best way to match the parent directory .. and other directories with the pseudo-filename ^^DIRECTORY^^.

AddAlt "text" name (f)

This specifies ALT text alongside AddIcon's images.

HeaderName name

This identifies the file whose contents should be placed above the file listing, if such a file exists in the directory being listed. The file name typically also appears in the IndexIgnore instruction.

The file can be either plain text, an HTML body fragment or an entire "top half" of an HTML page. To stop the server adding its own HTML top half see the IndexOptions option SuppressHTMLPreamble.

ReadmeName name

Exactly as HeaderName but it corresponds to the text below the listing. This can only be plain text or an HTML body fragment.
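For instance, a footer could be set up alongside the header like this. README.html is a hypothetical file name chosen for the example. As with the header, it is added to IndexIgnore so that it does not show up as a row in the listing.

HeaderName	HEADER.html
ReadmeName	README.html
IndexIgnore	HEADER.html README.html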

Syntax summary: Options to IndexOptions

FancyIndexing

Turns on the four-column (by default) indexing mode rather than plain, bullet-list indexing mode.

HTMLTable (f)

This instructs Apache to use an HTML table rather than a <PRE> block to present the listing.