JNRowe

mailfilter snippets for killing spam

Some simple mailfilter snippets for killing spam on the server

mailfilter is an excellent tool for killing mail that is sitting on a POP server before you have to download any of it. It works by examining the mail headers, and it lets you create rules that match certain patterns in them.

Warning

mailfilter is a great tool, but it does delete mail so be careful when configuring it. And always test it first with the -t/--test argument or by adding TEST = "yes" to your config file.

First up, set up a list of allowed users (friends, work colleagues, etc.) who should always escape the filtering, no matter whether they decide to use a spammy subject or send mail from known spammy domains. Add each one to your config file before you start, and don't forget to enable extended RegExp if you want to use the fancier match syntax.

I use abook to manage my personal addresses, and I simply create my friends allow list directly from its database. The following, incredibly ugly, Python snippet generates mailfilter entries from the ~/.abook/addressbook file. You can write the output to a file and tell mailfilter to include it by adding INCLUDE = path_to_output_file to your .mailfilterrc.

from os.path import expanduser

# abook stores a contact's addresses on a line such as "email=a@b.org,c@d.org"
with open(expanduser("~/.abook/addressbook")) as addressbook:
    for line in addressbook:
        # Only process lines containing an email address
        if not line.startswith("email"):
            continue
        # Scrub the newline, only take the actual address data
        addresses = line.strip().split("=", 1)[1].split(",")
        for address in addresses:
            # Escape the '.' character because mailfilter uses RegExps
            print('ALLOW = "^From: .*%s"' % address.replace(".", "\\."))

I use an SCM commit hook to update these generated files. Mercurial, the SCM I use to manage my home directory, is written in Python, so Python is already cache hot when the updates occur. If it wasn't, I'd probably suggest using sed or awk instead.

That being said, if you're generating a whole bunch of files, like me, it may be faster and lighter to use a single Python or Ruby script that processes all of them than it is to call awk separately on each.
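As a rough sketch of that single-pass idea, the script below parses the address book once and builds the text for several generated files from the same data. The output names and the plain-list format are hypothetical, made up purely for illustration, and the sample data stands in for a real ~/.abook/addressbook.

```python
import io


def abook_addresses(lines):
    """Yield every address found on abook's "email=..." lines."""
    for line in lines:
        if not line.startswith("email"):
            continue
        for address in line.strip().split("=", 1)[1].split(","):
            yield address.strip()


def mailfilter_rule(address):
    # Same escaping as the snippet above: only '.' is escaped
    return 'ALLOW = "^From: .*%s"' % address.replace(".", "\\.")


# Hypothetical set of generated files: output name -> line formatter
FORMATTERS = {
    "mailfilter-friends": mailfilter_rule,
    "plain-list": lambda address: address,
}


def generate(lines):
    """Parse the address book once and build the text of every output."""
    addresses = list(abook_addresses(lines))
    outputs = {}
    for name, formatter in FORMATTERS.items():
        outputs[name] = "\n".join(formatter(a) for a in addresses) + "\n"
    return outputs


# Stand-in for open(expanduser("~/.abook/addressbook"))
sample = io.StringIO(
    "[0]\nname=Some Friend\nemail=friend@example.org,alt@example.net\n"
)
outputs = generate(sample)
print(outputs["mailfilter-friends"], end="")
```

Each entry in the returned dictionary can then be written to its own file, and the mailfilter one pulled in with INCLUDE as described earlier.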

I can pretty safely guarantee you're going to need to automate your whitelisting tasks, because sooner or later you'll forget to update the data in one place or another if you manually try to keep the files in sync.

Important

The following filters reflect the mail that I receive, so don't copy them blindly. Basically, if you don't understand what they do, ignore them.

The first filters I set up are on spammy From headers:

DENY = "^From: .*@mail\.ru"
DENY = "^From: .*@(aol|msn|yahoo|hotmail)\.com"
DENY = "^From: .*@thedtn\.com"

# I've never received a non-spam mail from a .biz
DENY = "^From: .*\.biz"

# 4 numbers in the user part is always spam
DENY = "^From: .*[0-9]{4}.*@"

# Numbers at the start of the domain part is always spam
DENY = "^From: .*@[0-9]{2}"

# All caps user part is always spam
DENY_CASE = "^From: .*<[A-Z]*@"
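Before trusting rules like these with real mail, it can help to try them against a few sample headers. The following is only a sketch: it uses Python's re module rather than mailfilter's own regex engine (for these particular patterns the syntax should behave the same), and it assumes DENY matches case-insensitively while DENY_CASE is case-sensitive. The sample headers are invented.

```python
import re

# A selection of (pattern, case_sensitive) pairs copied from the rules above
RULES = [
    (r"^From: .*@mail\.ru", False),
    (r"^From: .*@(aol|msn|yahoo|hotmail)\.com", False),
    (r"^From: .*\.biz", False),
    (r"^From: .*[0-9]{4}.*@", False),
    (r"^From: .*@[0-9]{2}", False),
    (r"^From: .*<[A-Z]*@", True),
]


def matching_rules(header):
    """Return the patterns that would flag this header line."""
    hits = []
    for pattern, case_sensitive in RULES:
        flags = 0 if case_sensitive else re.IGNORECASE
        if re.search(pattern, header, flags):
            hits.append(pattern)
    return hits


samples = [
    "From: Some Friend <friend@example.org>",
    "From: Deals <offers2024@example.org>",
    "From: WINNER <WINNER@example.org>",
    "From: Bob <bob@lists.bizarre.example.org>",
]
for header in samples:
    print(header, "->", matching_rules(header) or "no match")
```

The last sample is worth a look: the .biz pattern is not anchored to the end of the domain, so it also catches a host name that merely contains ".biz". Checks like this make that kind of over-matching visible before any mail is deleted.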

Next up are the UAs that spammers use:

# Spam-only user agents
DENY = "^X-Mailer: The Bat"
DENY = "^(User-Agent|X-Mailer): MIME-tools"
DENY = "^(User-Agent|X-Mailer): Calypso"
DENY = "^(User-Agent|X-Mailer): Pegasus"
DENY = "^User-Agent: PObox"

I define mailing lists explicitly in the config file. This is important because it allows me to kill any mail that doesn't have me listed in the To or CC headers. Without explicitly listing mailing lists, most of the mail from them would be killed, unless you're an extremely popular CC target on the list. A small selection of my list config:

# Mailing lists
ALLOW = "^From: mailman-owner@"
ALLOW = "^X-BeenThere: users@mailman\.ukfsn\.org"

# Allow all Bugzilla mail through
ALLOW = "^From: bugzilla-daemon@"

The other benefit of defining mailing lists this way is that you still receive mail from users who send to lists from spammy domains. In my case it allows me to receive list mail from people who use Yahoo or AOL, whereas outside of mailing lists AOL, for example, shows a 1:6000 chance of being ham in my mail archive.

Once you've set up all your mailing lists you can instruct mailfilter to kill any mail that isn't directly sent to you:

DENY <> "^(To|CC): .*(me|old_me)@example\.com"

The above filter kills an enormous amount of spam, but it will obviously also kill any mail sent to a list that you haven't specifically added an ALLOW rule for, so you must keep your list definitions up to date!
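The way these rules interact can be sketched in a few lines of Python. This is a simplified model and not mailfilter's actual implementation: it assumes an ALLOW match always wins, that a plain DENY fires when any header line matches, and that the negated form fires when no header line matches. The sample messages are invented.

```python
import re

ALLOW = [
    r"^From: bugzilla-daemon@",
    r"^X-BeenThere: users@mailman\.ukfsn\.org",
]
DENY = [r"^From: .*@mail\.ru"]
# Negated rules ("DENY <>") fire when NO header line matches
DENY_UNLESS = [r"^(To|CC): .*(me|old_me)@example\.com"]


def any_match(patterns, headers):
    """True if any pattern matches any header line, ignoring case."""
    return any(
        re.search(pattern, header, re.IGNORECASE)
        for pattern in patterns
        for header in headers
    )


def verdict(headers):
    if any_match(ALLOW, headers):
        return "keep"
    if any_match(DENY, headers):
        return "delete"
    for pattern in DENY_UNLESS:
        if not any_match([pattern], headers):
            return "delete"
    return "keep"


direct = ["From: Some Friend <friend@example.org>", "To: me@example.com"]
listed = [
    "From: someone@aol.com",
    "To: users@mailman.ukfsn.org",
    "X-BeenThere: users@mailman.ukfsn.org",
]
unlisted = ["From: someone@example.org", "To: other-list@example.net"]

print(verdict(direct), verdict(listed), verdict(unlisted))
# prints: keep keep delete
```

The third message is the failure mode to watch for: list mail with no matching ALLOW rule is treated exactly like spam that was never addressed to you.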
