Both SpamAssassin and amavisd-new are Perl programs, and amavisd-new loads SpamAssassin's libraries directly, so it doesn't need the SpamAssassin daemon running on the server. We are going to turn off the daemon and prevent it from starting up during boot.
/etc/init.d/spamassassin stop
update-rc.d -f spamassassin remove
etckeeper commit "Removed spamassassin from rcX.d"
To enable DKIM checking of received emails in SpamAssassin, you have to install the Mail::DKIM Perl library.
apt-get install libmail-dkim-perl
Edit /etc/spamassassin/v312.pre and check that this line is uncommented:
loadplugin Mail::SpamAssassin::Plugin::DKIM
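The check can also be scripted, which is handy when the same setup is repeated on several servers. The following is a minimal sketch; the helper name `plugin_enabled` is our own invention and not part of SpamAssassin. It only reports whether an uncommented loadplugin line for the given plugin exists in a file.

```shell
# plugin_enabled FILE PLUGIN
# Succeeds (exit 0) if FILE contains an uncommented "loadplugin" line
# for PLUGIN. The helper name is hypothetical, not a SpamAssassin tool.
plugin_enabled() {
    grep -Eq "^[[:space:]]*loadplugin[[:space:]]+$2([[:space:]]|\$)" "$1"
}

# Example (path taken from the step above):
# plugin_enabled /etc/spamassassin/v312.pre Mail::SpamAssassin::Plugin::DKIM \
#     && echo "DKIM plugin enabled" || echo "DKIM plugin NOT enabled"
```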
We are also going to install Pyzor and Razor for additional checks.
apt-get install pyzor razor
After that you can try running SpamAssassin manually:
spamassassin -D -t < /usr/share/doc/spamassassin/examples/sample-spam.txt 2>&1 | tee sa.out
You should see DKIM mentioned in the sa.out file, and the end of the output should look something like this:
Content analysis details:   (1004.5 points, 5.0 required)
 pts rule name                 description
---- ------------------------- --------------------------------------------------
-0.0 NO_RELAYS                 Informational: message was not relayed via SMTP
1000 GTUBE                     BODY: Generic Test for Unsolicited Bulk Email
 0.4 RAZOR2_CF_RANGE_51_100    Razor2 gives confidence level above 50% [cf: 100]
 0.5 RAZOR2_CF_RANGE_E4_51_100 Razor2 gives engine 4 confidence level above 50% [cf: 100]
 1.7 RAZOR2_CHECK              Listed in Razor2 (http://razor.sf.net/)
 2.0 PYZOR_CHECK               Listed in Pyzor (http://pyzor.sf.net/)
 0.0 DIGEST_MULTIPLE           Message hits more than one network digest check
-0.0 NO_RECEIVED               Informational: message has no Received headers
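To avoid reading the whole file by eye, the rule names can be pulled out of the report. This is a rough sketch, assuming the report table keeps the "score, rule name, description" column order shown above; `rules_hit` is a hypothetical helper name.

```shell
# rules_hit FILE
# Prints the names of the rules listed in the "Content analysis details"
# table of a saved "spamassassin -t" report, one per line.
# Debug lines ("[pid] dbg: ...") are skipped because their first field
# is not a number.
rules_hit() {
    awk '$1 ~ /^-?[0-9]+(\.[0-9]+)?$/ && $2 ~ /^[A-Z][A-Z0-9_]+$/ { print $2 }' "$1"
}

# Example, using the sa.out file written by the command above:
# rules_hit sa.out | grep -E '^(GTUBE|RAZOR2_CHECK|PYZOR_CHECK)$'
# grep -ci dkim sa.out    # number of lines mentioning DKIM
```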
To enable the auto-whitelist (AWL), edit /etc/spamassassin/v310.pre and uncomment:
loadplugin Mail::SpamAssassin::Plugin::AWL
Edit /etc/spamassassin/local.cf and add:
use_auto_whitelist 1
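If you script your configuration, it helps to add the setting only when it is missing, so that running the script twice does not duplicate the line. A minimal sketch, with `ensure_line` as a hypothetical helper name:

```shell
# ensure_line FILE LINE
# Appends LINE to FILE unless an identical line is already present, so the
# step can be repeated safely. The helper name is hypothetical.
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Example (run as root):
# ensure_line /etc/spamassassin/local.cf 'use_auto_whitelist 1'
```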
Bayes filtering is a strong weapon for fighting spam. It works by learning what is spam to you and what isn't. For SpamAssassin to start using Bayes filtering you have to train it first. Training your Bayes filters is something that you should do on a regular basis. The more emails it processes, the smarter it gets.
To teach SpamAssassin what is spam, you have to use the sa-learn utility on a folder where your spam messages are stored (in my case the folder is called Junk).
Because SpamAssassin is run by amavisd-new you have to run the sa-learn utility as the amavis user.
su amavis -c 'sa-learn --no-sync --spam /home/vmail/example.com/demo/.Junk/cur'
To teach it what is not spam, run sa-learn on a folder that contains only your non-spam mail (in this case, sa-learn examines the Inbox folder).
su amavis -c 'sa-learn --no-sync --ham /home/vmail/example.com/demo/cur'
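With more than one mailbox, the two commands above can be generated for each mailbox instead of being typed by hand. The sketch below assumes the /home/vmail/<domain>/<user> Maildir layout used in the examples and paths without spaces; `training_commands` is a hypothetical helper name. It only prints commands, so you can review them before running anything.

```shell
# training_commands BASE
# Prints one sa-learn command per Maildir folder found under BASE, assuming
# the BASE/<domain>/<user>/ layout used in the examples above. Nothing is
# executed; review the output, then pipe it to "sh" as root to run it.
training_commands() {
    for box in "$1"/*/*/; do
        box=${box%/}
        [ -d "$box/.Junk/cur" ] &&
            printf "su amavis -c 'sa-learn --no-sync --spam %s'\n" "$box/.Junk/cur"
        [ -d "$box/cur" ] &&
            printf "su amavis -c 'sa-learn --no-sync --ham %s'\n" "$box/cur"
    done
    # --no-sync defers the journal merge, so sync once at the end.
    printf "su amavis -c 'sa-learn --sync'\n"
}

# Example:
# training_commands /home/vmail          # review
# training_commands /home/vmail | sh     # run
```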
Bayes filtering is only used once SpamAssassin has learned at least 200 spam and 200 ham messages (the default values of bayes_min_spam_num and bayes_min_ham_num).
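To see whether the 200-message threshold has been reached, the counters printed by sa-learn --dump magic can be checked. This is a sketch under the assumption that the dump uses the SpamAssassin 3.x column layout, where the third column of the "non-token data: nspam" and "non-token data: nham" lines holds the count; `bayes_ready` is a hypothetical helper name.

```shell
# bayes_ready [MIN]
# Reads "sa-learn --dump magic" output on stdin and succeeds if both the
# learned spam (nspam) and ham (nham) counts are at least MIN (default 200).
# Assumes the count is the third whitespace-separated column.
bayes_ready() {
    awk -v min="${1:-200}" '
        /non-token data: nspam/ { nspam = $3 }
        /non-token data: nham/  { nham  = $3 }
        END {
            printf "nspam=%d nham=%d\n", nspam, nham
            exit !(nspam >= min && nham >= min)
        }'
}

# Example:
# su amavis -c 'sa-learn --dump magic' | bayes_ready && echo "Bayes is active"
```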
To update SpamAssassin's rules you can run:
sa-update -D
The -D option enables debug output.
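If you run the update from cron, the exit status of sa-update tells you what to do next: 0 means new rules were installed, 1 means there was nothing new, and anything else indicates a problem. Because amavisd-new loads the rules when it starts, a restart after a successful update is a reasonable follow-up. The following is only a sketch; `update_action` is a hypothetical helper name.

```shell
# update_action STATUS
# Maps an sa-update exit status to a follow-up step.
update_action() {
    case "$1" in
        0) echo "restart" ;;   # new rules installed: restart amavisd-new
        1) echo "nothing" ;;   # no fresh updates available
        *) echo "error" ;;     # lint failure or other error: investigate
    esac
}

# Example cron-style use (run as root):
# sa-update; action=$(update_action $?)
# [ "$action" = restart ] && /etc/init.d/amavis restart
```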
Enter the mysql client and create a database and a user for SpamAssassin.
CREATE DATABASE mail_spamassassin;
CREATE USER 'spamassassin'@'localhost' IDENTIFIED BY 'new_password';
GRANT ALL PRIVILEGES ON `mail_spamassassin`.* TO 'spamassassin'@'localhost';
FLUSH PRIVILEGES;
Use wget to download the conversion script that we will need from http://spamassassin.apache.org/full/3.0.x/dist/tools/.
cd /root
wget http://spamassassin.apache.org/full/3.0.x/dist/tools/convert_awl_dbm_to_sql
To create the tables in MySQL run:
mysql -u root -p mail_spamassassin < /usr/share/doc/spamassassin/sql/bayes_mysql.sql
mysql -u root -p mail_spamassassin < /usr/share/doc/spamassassin/sql/awl_mysql.sql
Edit /etc/spamassassin/local.cf and add this at the end:
bayes_store_module Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn DBI:mysql:mail_spamassassin:localhost
bayes_sql_username spamassassin
bayes_sql_password new_password
bayes_sql_override_username amavis
auto_whitelist_factory Mail::SpamAssassin::SQLBasedAddrList
user_awl_dsn DBI:mysql:mail_spamassassin:localhost
user_awl_sql_username spamassassin
user_awl_sql_password new_password
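Since the same database is named twice in these settings, a typo in one DSN is easy to miss. A quick sanity check is sketched below; `dsn_settings` is a hypothetical helper name. Separately, spamassassin --lint parses the whole configuration and reports syntax errors.

```shell
# dsn_settings FILE
# Prints the value of every uncommented *_dsn option in FILE, de-duplicated.
# If Bayes and AWL point at the same database, exactly one line is printed.
dsn_settings() {
    awk '$1 ~ /^[a-z_]+_dsn$/ { print $2 }' "$1" | sort -u
}

# Example:
# dsn_settings /etc/spamassassin/local.cf
# [ "$(dsn_settings /etc/spamassassin/local.cf | wc -l)" -eq 1 ] && echo "DSNs match"
```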
Now we need to initialise the database table:
su amavis -c 'sa-learn --spam /usr/share/doc/spamassassin/examples/sample-spam.txt'
If you are starting clean and do not have existing Bayes data, you can skip the import of existing data into the database.
Otherwise, move to the amavis user's home folder and dump the existing data:
cd /var/lib/amavis/.spamassassin
su amavis -c 'sa-learn --sync --force-expire'
su amavis -c 'sa-learn --backup' > /root/backup.txt
To convert the AWL data:
cd /root
chmod +x convert_awl_dbm_to_sql
./convert_awl_dbm_to_sql
To run the actual conversion, copy this line and replace the password and any other values you have modified:
./convert_awl_dbm_to_sql --username amavis --dsn DBI:mysql:mail_spamassassin:localhost --dbautowhitelist /var/lib/amavis/.spamassassin/auto-whitelist --sqlusername spamassassin --sqlpassword new_password --ok
To insert the Bayes data run:
su amavis -c 'sa-learn --restore backup.txt'
The existing data should be inserted into MySQL.
Restart amavisd-new and commit the changes made.
/etc/init.d/amavis restart
etckeeper commit "Moved Bayes and AWL data to MySQL"
If you have enough Bayes and AWL data you can test SpamAssassin like this:
atlantis:~# su amavis -c "spamassassin -D -t < /usr/share/doc/spamassassin/examples/sample-spam.txt 2>&1 | egrep '(bayes:|whitelist:|AWL)'"
[26387] dbg: plugin: loading Mail::SpamAssassin::Plugin::AWL from @INC
[26387] dbg: bayes: using username: amavis
[26387] dbg: bayes: database connection established
[26387] dbg: bayes: found bayes db version 3
[26387] dbg: bayes: Using userid: 1
[26387] dbg: bayes: corpus size: nspam = 27443, nham = 7248
[26387] dbg: bayes: tok_get_all: token count: 65
[26387] dbg: bayes: score = 0.25382662007679
[26387] dbg: bayes: DB expiry: tokens in DB: 116828, Expiry max size: 150000, Oldest atime: 1287200200, Newest atime: 1298305866, Last expire: 1298265907, Current time: 1298310696
[26387] dbg: auto-whitelist: sql-based connected to DBI:mysql:mail_spamassassin:localhost
[26387] dbg: auto-whitelist: sql-based using username: amavis
[26387] dbg: auto-whitelist: sql-based get_addr_entry: no entry found for sender@example.net|ip=none
[26387] dbg: auto-whitelist: sql-based sender@example.net|ip=none scores 0/0
[26387] dbg: auto-whitelist: AWL active, pre-score: 1006.014, autolearn score: 6.014, mean: undef, IP: undef
[26387] dbg: auto-whitelist: sql-based add_score: created new entry for sender@example.net|ip=none with totscore: 6.014
[26387] dbg: auto-whitelist: sql-based finish: disconnected from DBI:mysql:spamassassin:localhost
[26387] dbg: auto-whitelist: post auto-whitelist score: 1006.014
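The two lines that matter most in this output are the ones confirming that Bayes and the auto-whitelist are both talking to MySQL. A small sketch that checks for exactly those two debug messages as they appear in the sample output above (other SpamAssassin versions may word them differently); `sql_backends_ok` is a hypothetical helper name.

```shell
# sql_backends_ok
# Reads "spamassassin -D" output on stdin; succeeds only if both the Bayes
# and the auto-whitelist debug lines report an SQL connection.
sql_backends_ok() {
    awk '
        /bayes: database connection established/ { bayes = 1 }
        /auto-whitelist: sql-based connected/    { awl = 1 }
        END { exit !(bayes && awl) }'
}

# Example:
# su amavis -c "spamassassin -D -t < /usr/share/doc/spamassassin/examples/sample-spam.txt 2>&1" \
#     | sql_backends_ok && echo "Bayes and AWL are using MySQL"
```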
Warning: Some of the data in the database should be pruned regularly. For more information on how to create scripts that automatically prune stale records from your database, take a look here.