The following instructions are intended to aid existing users who have legacy mail processing enabled. If you are using Hermes Webmail Service, please visit the University Information Services website for instructions and further information on e-mail filtering.
Tools are available for the Lab mail user to write really rather complex instructions for the mail system to interpret before the mail is delivered. This document shows what you can do, and how to do things people commonly want to do.
There are, of course, things you cannot do. A significant one is that you may not “bounce” mail (i.e., refuse the connection); that is something only the system itself is allowed to do.
The filter language
The filter language allows you to write simple linear scripts, with conditional discrimination based on the content of the message (and possibly the external context). The filter resides in the same .forward file that is used for simple forwarding. Obviously the mail system needs to distinguish the two sorts of file, and so a filter file must start with the characters:
# Exim filter
which has the disadvantage that it doesn’t look remarkable enough and is prone to getting deleted. So you are recommended to write:
# Exim filter <<-- do NOT delete this line
(Anything on the line after “Exim filter” is treated as a comment, and the words may be in upper or lower case.)
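As a minimal sketch, a complete filter file that does nothing beyond the default delivery could look like this. The layout and the comment are illustrative only; the one fixed requirement is the first line.

```exim
# Exim filter   <<-- do NOT delete this line

# Nothing above has delivered the message, so put it in the default place.
if not delivered then
  save .mail
endif
```

The commands used here are described in the sections that follow.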
You are not recommended to read the filter language specification as an introductory text. Rather, read the rest of this document for an outline of the language and a few useful examples, and then use the specification as a reference guide.
The useful commands in the filter language
The filter language has a simple structure; you typically use an extended conditional tree to discriminate between various types of message, and then perform some action at each leaf node of that tree.
The conditional tree is made up of if, then, elif (meaning “else+if”), else and endif commands. So the general structure of a tree is
if ( condition ) then
  commands
elif ( condition ) then
  commands
else
  commands
endif
The conditions are built up from relations between variables and strings or numbers; elif is an abbreviation for else if, and endif ends a conditional command group. The string relations may be “is”, “contains”, “matches”, “is not”, “does not contain” or “does not match”. A variable is an element of the message, or some other important feature of what’s going on. It may be
- some part of the header, such as “$h_from” or “$h_subject”,
- a body-part, such as “$message_body” (the first few thousand characters of the message) or “$message_body_end” (the last few thousand characters of the message),
- a statistic, such as “$message_body_size”,
- a derived value, such as “$home” (the user’s home directory, of course), “$recipients” (the set of people who will receive this message), and so on.
- a script variable, such as “delivered”, which records whether anything has actually delivered the message in the script-so-far.
In the case of “matches” and “does not match”, the thing to match against is a regular expression, whose syntax follows that of Perl pretty closely. You can test your regular expressions using the command pcretest.
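As an illustration of the string relations, the following sketch tests one header with “is” and another with “matches”. The header values, the regular expression and the file names are invented for the example, and the files are assumed to exist already.

```exim
# Exact comparison against the whole value of a header.
if ( $h_precedence is "bulk" ) then
  save .mail.bulk
  finish
endif

# Regular expression: the subject starts with "urgent",
# optionally preceded by "Re: ".
if ( $h_subject matches "^(Re: *)?urgent" ) then
  save .mail.urgent
  finish
endif
```

The expression deliberately avoids backslashes and dollar signs, which need extra care inside a quoted filter string.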
Discriminating between messages
Useful discriminators are:
- Who sent the message — $h_from and $h_sender
- Who the message was sent to — $h_to and $h_cc
- Whether this mail was directly addressed to you — builtin condition “personal” detects if the mail was addressed to you, or was one you sent and copied to yourself.
- The subject of the message — $h_subject
- The size of the message — $message_body_size
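A sketch of how these discriminators might be combined in one conditional tree is given here. The sender address, the size limit and the file names are hypothetical, and each file is assumed to exist before the filter is enabled.

```exim
# Mail addressed directly to you stays in the incoming mail file.
if personal then
  save .mail
  finish
# Mail from a particular (hypothetical) automated sender.
elif ( $h_from contains "noreply@example.com" ) then
  save .mail.automated
  finish
# Anything else with a large body is kept apart.
elif ( $message_body_size is above 500K ) then
  save .mail.large
  finish
endif
```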
The actions you can take
There are a small number of basic actions you may take, having identified a message class:
- save saves a message to a file. This is in fact the action the mailer takes by default, saving a message into your incoming mail file by the equivalent of
save .mail
The save command marks the message as delivered (see unseen, below). A common way to end a filter script is the command
if not delivered then save .mail endif
meaning “if all else fails, deliver to the ‘default’ place”.
- pipe passes the message to an “approved” command (the commands that are available are specifically selected: don’t try random Unix commands that might be helpful).
The pipe command marks the message as delivered (see unseen, below).
- rcvstore is a command you may pipe into, which puts the message into a folder. The process performed by the mh inc command is equivalent to doing
pipe "rcvstore +inbox"
at the delivery of each message.
The message folder you want to rcvstore mail into must exist beforehand; if it doesn’t, mail may be lost (or, if you’re lucky, merely delayed). See Checking the filter, below.
- marks the message as delivered, without doing
anything. Uses for this facility are few and far between; the
acts the same (abandoning the message) as does
on its own.
- unseen modifies a command to suppress the “mark as delivered” action. So neither of
unseen pipe "rcvstore +foo"
unseen save .bar-mail
will mark the message as delivered.
- marks the message as having been delivered. There is probably no value in doing this; if you want to abandon a message, merely issue the command finish (see below), rather than falling through and hoping the system will believe you’ve not seen it.
- finish stops processing. You may do this at the end of a condition; an implicit finish is inserted at the end of the filter file if you’ve not put one there yourself.
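Putting the actions together, a sketch of a small filter might read as follows. The subject tag, folder name, size limit and file name are hypothetical; the folder and the file are assumed to exist already.

```exim
# Exim filter   <<-- do NOT delete this line

# File tagged list traffic in a folder without marking it as delivered.
if ( $h_subject contains "[project]" ) then
  unseen pipe "rcvstore +project"
endif

# Large messages are saved to a separate file and processing stops there.
if ( $message_body_size is above 1M ) then
  save .mail.large
  finish
endif

# Whatever is still undelivered goes to the default place.
if not delivered then
  save .mail
endif
```

A message tagged [project] is filed in the folder and, because of unseen, carries on through the rest of the script. A large message is saved separately and finish stops it there. Whatever is still undelivered at the end goes to the incoming mail file.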
Disposing of spam
External mail coming to the department passes through a computing service ‘mail hub’, which checks for (and removes) viruses, and runs the message through a spam-detector. The spam detector adds header information about its reaction to the message: here’s one from a recent bit of rubbish:
X-Cam-ScannerInfo: http://www.cam.ac.uk/cs/email/scanner/
X-Cam-AntiVirus: No virus found
X-Cam-SpamDetails: scanned, SpamAssassin (score=8.7,
    FORGED_YAHOO_RCVD 2.29, HABEAS_HIL 4.00, HTML_60_70 0.10, HTML_MESSAGE 0.10,
    MIME_HTML_ONLY 0.10, RAZOR2_CHECK 2.06)
X-Cam-SpamScore: ssssssss
The URL on the first line tells you about the scanner, what it does, and what it’s protecting you against. The second line reports that there was no detectable virus in the mail. The third (and fourth and fifth) line(s) tell you what the spam scanner thought of the mail: this isn’t terribly enlightening, unless you’re working on the scanner itself — the important thing is the score, which in this case is 8.7. The final line is that value, coded so as to be easily spotted by a mail filter; there are 8 “s”s there. This particular mail was trapped by a statement:
if ( $h_X-Cam-SpamScore contains "sssssss" ) then
  pipe "rcvstore +spam"
  finish
endif
The condition mentions 7 “s”s, and there were 8 in the message, so the condition matched. In fact, mail with a score below 7 is pretty frequently spam, but once you get down to really low scores, you’re in danger of rejecting legitimate mail. A common way round this is:
if ( $h_X-Cam-SpamScore contains "sssss" ) then
  save .mail.spam
  finish
elif ( $h_X-Cam-SpamScore contains "sss" ) then
  pipe "rcvstore +spam-suspect"
  finish
endif
The first condition spots mail with spam score 5 or greater, and simply saves it to a file. The second condition spots mail with a score of 3 or 4, and puts it in a folder “spam-suspect”; you should scan the folder regularly, to check that important mail is not being missed. (The scan can be pretty straightforward: it’s usually possible to tell that mail is significant on the basis of its author — who it’s from — and its subject.)
A folder for a mailing list
If you subscribe to a mailing list that you regard only as background reading, you may wish to move messages sent to the list directly into a folder. (There are of course other reasons you might care to do this.)
Some mailing lists generously put a tag in every subject line:
Subject: [sibelius-list] Spot the Difference
or
Subject: Re: [sibelius-list] Spot the Difference
In such a case, we use a match on the subject line:
if ($h_subject contains "[sibelius-list]") then
  pipe "rcvstore +sibelius"
  finish
endif
Another common discriminator is the sender, or the address the mail was sent to:
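A minimal sketch of such a test, assuming two hypothetical lists (one recognised by the address it is sent to, the other by its sender) and a hypothetical folder name, might be:

```exim
# list-a is recognised by the address the mail was sent to,
# list-b by the Sender header its list software adds.
if ( $h_to contains "list-a@lists.example.org" or
     $h_cc contains "list-a@lists.example.org" or
     $h_sender contains "list-b-owner@lists.example.org" ) then
  pipe "rcvstore +lists"
  finish
endif
```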
In this case, two different mailing lists are going to end up in the same folder.
Checking the filter
Always check your mail filter before you install it. As a general rule, the best bet is to write the filter iteratively, and to run checks on the changes. There are two stages to each check:
- Ensure that every folder mentioned exists, and that the mail system can see it (see section General precautions).
- Similarly, care is needed when you use the save command; always ensure the file exists (as an empty file) before you enable the filter. (The mail system will create your $HOME/.mail file for you, before you even arrive, but it won’t create a file in any other directory.)
- Check the syntax of the filter with the cl-ckfilter tool. The tool checks the syntax of the filter file in $HOME/.forward-new, and reports some details of what it would do in a “normal” case.
Note that the line “Filtering did not set up a significant delivery.” is OK; the next line “Normal delivery will occur.” is to be read as “therefore …”. (The filtering message has an implicit command:
if not delivered then save .mail endif
at its end.)
Once the check is complete (not reporting errors), it’s safe to:
cp ~/.forward ~/.forward-save
cp ~/.forward-new ~/.forward
And finally, get someone else to send you a message. If it bounces, copy .forward-save back to .forward as soon as possible!
It’s easy to imagine that this testing verges on the paranoiac: however, when the alternative is the potential seriously to delay mail, or even to lose it, thorough testing is well worth while.
General precautions
Something that’s not always obvious to people is that mail delivery happens on a central machine that runs the mail server. Even if you always use the same machine, the mail delivery system can’t touch disc space on that machine. So filter instructions like:
will cause mail to back up in the server, and you’ll eventually hear from an upset systems administrator. The mail server machine just can’t see your attractive looking scratch space. A reasonable alternative is to apply for an allocation on /anfs/bigdisc, and to save your spam there.
The same problem may arise from private servers. Some research groups run /usr/groups directories on group servers; some scratch and other machine-local directories are available via NFS (which in effect makes such machines private servers, too). Save mail to disc on non-central servers such as these, and once again, you’re in danger of systems administrator enragement — systems administrators reluctantly accept the blame when a central system goes down, but when you crash your own workstation and mail backs up, there’s no (recognised) excuse.