The MTA receives incoming email from external sources (usually the mta.york.ac.uk gateways run by the University), runs some checks on it and routes it to the correct user. In this process it handles delivery to mailing lists and aliases using a series of redirect routers. As the last step mail is run through the Dovecot LDA to process user filters and deliver it to user mailboxes.
Due to the withdrawal of the University's spam and virus filtering on their mail relays in October 2012 (as part of the move to Google Apps), Exim operates spam and virus checking on incoming mail, scanning mail using ClamAV and SpamAssassin when it is first received by the server. One of Exim's Access Control Lists (ACLs) is acl_check_data, which is run immediately after a message is received by the server, before the sender has disconnected. This ACL performs the virus and spam scans, and adds the results to the message headers.
Despite the move to Google Apps, we have been assured (like other administrators of legacy email systems in the University) that a small number of MTAs will remain active indefinitely to route our mail.
Exim can be a tricky piece of software to reconfigure, but fortunately there are some handy commands to help you. Firstly,
exim -bt email@example.com
shows how Exim will attempt to route mail, which is good for checking why a user gets no mail or whether forwarding works properly. This works for any email address, including the mailing lists; external addresses, however, will just show the mail routing to the University mail servers.
Another useful one is
exim -bh 127.0.0.1
which simulates an incoming SMTP connection from the given IP address. It is just like telnetting to the server and sending a mail, except that no email is actually sent; Exim just goes through all the other motions to do with permissions and spam/virus checks.
I got these from this cheatsheet.
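When several addresses need checking at once (for example a user and the lists they belong to), the routing test can be wrapped in a small loop. This is only a sketch: the function name check_routes and the EXIM_BIN variable are made up for illustration, and on some distributions the binary is called exim4 rather than exim.

```shell
#!/bin/sh
# Sketch: run Exim's address test (-bt) for each address given.
# EXIM_BIN is a hypothetical override; set it if the binary is not "exim".

check_routes() {
    for addr in "$@"; do
        echo "== $addr"
        # -bt only tests routing; no mail is sent.
        "${EXIM_BIN:-exim}" -bt "$addr" || echo "routing failed for $addr" >&2
    done
}

# Example (addresses are placeholders):
#   check_routes someuser@example.com somelist@example.com
```

Routing tests usually need to be run as root or as the Exim user so that Exim can read its configuration and any lookup files.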
ClamAV is configured to run as a daemon using its default configuration, and Exim connects to it through a socket at
/var/run/clamd.exim/clamd.sock
Exim runs the virus scanner on incoming messages via this socket, and any messages found to contain a virus are rejected (with a note in the logfile).
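If virus scanning appears to have stopped working, a quick first check is whether the socket exists at all. The sketch below assumes the socket path given above; the function name clamd_socket_present and the CLAMD_SOCKET variable are hypothetical. A socket file being present does not prove that clamd is accepting connections, so treat this as a first step only.

```shell
#!/bin/sh
# Sketch: check that the clamd socket Exim is expected to use is present.
# CLAMD_SOCKET is a hypothetical override for the path from this page.
CLAMD_SOCKET="${CLAMD_SOCKET:-/var/run/clamd.exim/clamd.sock}"

clamd_socket_present() {
    # -S is true only if the path exists and is a socket.
    [ -S "${1:-$CLAMD_SOCKET}" ]
}

if clamd_socket_present; then
    echo "clamd socket found at $CLAMD_SOCKET"
else
    echo "no socket at $CLAMD_SOCKET; is clamd running?" >&2
fi
```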
Next, Exim connects to the SpamAssassin daemon (spamd) and passes it the message, which is scanned against the default set of rules. These assign the message spam points based on contents, senders, headers, white/blacklists and Bayesian statistical analysis. The results of this analysis are added to the message in the form of X-Spam-Score and X-Spam-Report headers. A message with a spam score > 5 is considered spam and has the X-Spam-Flag: YES header added as well. If a message scores > 10 it is currently routed to /var/tmp/quarantine and not delivered to the recipient. The quarantine will be used to assess whether mail with a score > 10 can be safely rejected outright rather than filling up users' mailboxes.
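The two thresholds described above can be summarised as a small shell function. This is purely an illustration of the policy (over 5 is flagged, over 10 is quarantined); the real decisions are made inside Exim's configuration, and the function name classify_score is invented for this example.

```shell
#!/bin/sh
# Sketch: map a SpamAssassin score to the action described on this page.
# Prints one of: deliver, flag, quarantine.
classify_score() {
    # awk is used because scores are decimals (e.g. 5.3), which
    # plain shell arithmetic cannot compare.
    awk -v s="$1" 'BEGIN {
        s = s + 0                       # force numeric comparison
        if (s > 10)      print "quarantine"
        else if (s > 5)  print "flag"
        else             print "deliver"
    }'
}

# Example:
#   classify_score 7.2
```

Both comparisons are strict, so a score of exactly 5 is still delivered unflagged and a score of exactly 10 is flagged but not quarantined.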
SpamAssassin configuration is in
/etc/mail/spamassassin/local.cf, which overrides a small number of default configuration options. These overrides change the spam report formatting and include a required_hits line, which can be used to change the threshold spam score from 5 for all users. The Bayesian database is updated by cron, and the cronjob is in
For a while bayes_journal wasn't updating:
Ok, fixed it, looks like bayes_journal is created by the apache user, with group apache and permissions 770, but spamd couldn't write to it as it wasn't a member of apache; I think when I first set it up spamd ended up being the owner of that file, but at some point it got deleted and recreated by apache. Solution: add spamd to the apache group:
usermod -aG apache spamd
- Sam Nicholson, 21 January 2013
Same issue as before with /data/spamassassin/bayes_journal, but due to various server changes the fix no longer works (no more apache). The temporary solution is to change the ownership of the Bayes files in that folder to the user that SpamAssassin runs as, but this gets reset every so often.
chown -Rv mail bayes_*
- Connor Sanders, 03 August 2021
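Since the ownership of these files has drifted more than once, it may help to check it quickly before reapplying a fix. The sketch below lists owner, group and mode for each bayes_* file. It assumes GNU stat (for the -c option) and the /data/spamassassin directory mentioned above; the function name show_bayes_perms is hypothetical.

```shell
#!/bin/sh
# Sketch: print "owner:group mode path" for each Bayes file.
# Assumes GNU coreutils stat; the default directory is taken from this page.
show_bayes_perms() {
    dir="${1:-/data/spamassassin}"
    for f in "$dir"/bayes_*; do
        # Skip the unexpanded pattern if no files match.
        [ -e "$f" ] || continue
        stat -c '%U:%G %a %n' "$f"
    done
}

# Example:
#   show_bayes_perms
```

If the owner shown is not the user that SpamAssassin runs as, the journal will probably stop updating again.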