Training SpamAssassin, Horde, IMAP, Plesk user configuration
August 30th, 2007
What a nightmare! It has taken me about a month of web searching, playing, testing and configuring to get user-specific spam training with SpamAssassin running on a Horde IMAP email client on a Plesk virtual hosts system.
The problem with the default setup is that in order to train SpamAssassin you need to log into the server’s Plesk interface and select which emails are spam and which are ham. From that interface you can’t read the emails, so you never know for sure. Also, this interface will only let you scan the inbox. This whole process does not work well for users who are using Horde with IMAP.
Bet you want to know how? There are several issues which need to be addressed in order to make it work properly.
- Horde application needs to fill in the correct SpamAssassin variables
- Horde needs to know what to do with the spam or ham emails
- Web user is different from the SpamAssassin user so the difference needs to be resolved.
Firstly, Horde needs to fill in the correct variables when building the spam / notspam command, so that %u expands to the full login, %l to the part before the @ and %d to the domain. In file /usr/share/psa-horde/imp/lib/Spam.php make the following changes:
PHP:
/* If a (not)spam reporting program has been provided, use
 * it. */
if (!empty($GLOBALS['conf'][$action]['program'])) {
    $raw_msg = $imp_contents->fullMessageText();
    /* Use a pipe to write the message contents. This should
     * be secure. */
    $email_address = explode("@", Auth::getAuth());

    $prog = str_replace('%u', escapeshellarg(Auth::getAuth()), $GLOBALS['conf'][$action]['program']);
    $prog = str_replace('%l', escapeshellarg($email_address[0]), $prog);
    $prog = str_replace('%d', escapeshellarg($email_address[1]), $prog);
    $proc = proc_open($prog,
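To make the substitution concrete, here is a minimal shell sketch of what the expanded command should end up looking like. It is only an illustration: `expand_template` is a hypothetical helper and `jane@example.com` is a made-up address. The real work is done by the PHP above, which also shell-escapes each value.

```shell
#!/bin/bash
# Hypothetical illustration of the %u / %l / %d expansion done in Spam.php.
# Unlike the PHP, this sketch does not shell-escape the values.
expand_template() {
    local template=$1 address=$2
    local localpart=${address%%@*}   # text before the first "@"
    local domain=${address#*@}       # text after the first "@"
    # The backslash keeps "%" literal inside the substitution pattern.
    template=${template//\%u/$address}
    template=${template//\%l/$localpart}
    template=${template//\%d/$domain}
    printf '%s\n' "$template"
}

expand_template '/var/qmail/popuser/bin/saver.sh %l %d spam' 'jane@example.com'
```

With the example address, the command line passed to the shell would name `jane` as the user and `example.com` as the domain.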
Next tell Horde what to do with the spam / notspam emails. Edit the configuration file (/usr/share/psa-horde/imp/config/conf.php) to tell Horde how to do the training, and when to show the “Report as Spam” and “Report as Innocent” links.
Add the following lines:
PHP:
$conf['spam']['reporting'] = true;
$conf['notspam']['reporting'] = true;
$conf['spam']['program'] = '/var/qmail/popuser/bin/saver.sh %l %d spam > /dev/null 2> /dev/null';
$conf['notspam']['program'] = '/var/qmail/popuser/bin/saver.sh %l %d ham > /dev/null 2> /dev/null';
The reporting variables tell Horde to print the report links on every page and the program variables tell it what to do.
The popuser’s home directory does not exist by default, so we will create it. This gives us somewhere to house all the required programs.
Code:
mkdir /var/qmail/popuser
cd /var/qmail/popuser
mkdir bin train
chown popuser:popuser bin train
chmod 755 bin
chmod 703 train
The training directory must be closed: we are going to store copies of emails there while they are waiting to be processed, and we don’t want prying eyes looking over them.
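Mode 703 gives the owner full access, while everyone else may create files in the directory but not list it. The following sketch shows how to inspect that on a throwaway directory; it assumes GNU `stat` (as found on Linux), and `show_mode` is a hypothetical helper.

```shell
#!/bin/bash
# Sketch: inspect the mode that "chmod 703" gives a directory.
# Assumes GNU stat; show_mode is a hypothetical helper.
show_mode() {
    stat -c '%a %A' "$1"    # octal mode, then symbolic form
}

demo=$(mktemp -d)           # throwaway directory, not the real train folder
mkdir "$demo/train"
chmod 703 "$demo/train"
show_mode "$demo/train"
rm -rf "$demo"
```

The symbolic form should read `drwx----wx`: no read bit for "other", so the web server user can drop a file in but cannot see what else is there.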
Now we must recreate the saver.sh and trainer.sh programs, to handle the different user permissions.
saver.sh:
Code:
#!/bin/bash
#
# Author: David Newcomb
# Copyright: BigSoft Limited (c) 2007
#


# Handle parameters
USER=$1
DOMAIN=$2
WHAT=$3

TRAIN_DIR="/var/qmail/popuser/train"
DATE=`date +'%Y%m%d%H%M%S%N'`
UNIQFILE="$USER:$DOMAIN:$WHAT:$DATE.$$"
FILE="$TRAIN_DIR/$UNIQFILE"

# Copy the message from stdin into the training directory
cat > "$FILE"

# Mark as ready to pick up
chmod o+r "$FILE"
mv "$FILE" "$FILE.done"
This simple program takes input from stdin and writes it into a special file in a special directory.
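To try the hand-off without touching the real training directory, the same steps can be sketched as a function pointed at a temporary directory. `save_message`, the user `jane` and the domain `example.com` are made up for illustration; the naming scheme and the final rename follow saver.sh.

```shell
#!/bin/bash
# Sketch of the saver.sh hand-off, pointed at a temporary directory
# so it can be tried without touching /var/qmail/popuser/train.
save_message() {
    local train_dir=$1 user=$2 domain=$3 what=$4
    local file
    file="$train_dir/$user:$domain:$what:$(date +'%Y%m%d%H%M%S%N').$$"
    cat > "$file"              # message arrives on stdin
    chmod o+r "$file"
    mv "$file" "$file.done"    # rename only once the copy is complete
}

tmp=$(mktemp -d)
printf 'Subject: test\n\nhello\n' | save_message "$tmp" jane example.com spam
ls "$tmp"
rm -rf "$tmp"
```

The rename at the end matters because the trainer only looks for files ending in .done, so it should never pick up a message that is still being written.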
trainer.sh:
Code:
#!/bin/bash
#
# Author: David Newcomb
# Copyright: BigSoft Limited (c) 2007
#


TRAINER_DIR=/var/qmail/popuser/train
SPAM="/usr/bin/sa-learn -u %u --dbpath /var/qmail/mailnames/%d/%l/.spamassassin -L --spam"
HAM="/usr/bin/sa-learn -u %u --dbpath /var/qmail/mailnames/%d/%l/.spamassassin -L --ham"
PATH=/bin:$PATH

ls $TRAINER_DIR/*.done | \
while read FILENAME
do
    # Decode user, domain and spam/ham from the filename
    `echo "$FILENAME" | sed 's/.*\/\(.*\):\(.*\):\(.*\):\(.*\)/export USER=\1 DOMAIN=\2 WHAT=\3/'`

    if [ "$WHAT" = "spam" ]
    then
        DO=$SPAM
    else
        DO=$HAM
    fi

    # Fill in the sa-learn command template
    PARSED=`echo $DO | sed "s/%u/$USER@$DOMAIN/g"`
    PARSED=`echo $PARSED | sed "s/%l/$USER/g"`
    PARSED=`echo $PARSED | sed "s/%d/$DOMAIN/g"`

    # Feed the message to sa-learn, then remove it
    cat "$FILENAME" | $PARSED
    rm -f "$FILENAME"

done
This program reads the special directory and decodes each special filename into the user, the domain and whether the contents are spam or ham. It then runs the sa-learn program, pointing it at that specific user’s bayes_toks files.
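As an alternative sketch of the decoding step only, the filename can be split with the shell's own `read` rather than sed. This is not the method used in trainer.sh; `decode_name` is a hypothetical helper and the sample filename is made up.

```shell
#!/bin/bash
# Sketch: decode "user:domain:what:timestamp.pid.done" with read
# instead of sed. decode_name is a hypothetical helper.
decode_name() {
    local base=${1##*/}          # strip the directory part
    local user domain what rest
    IFS=: read -r user domain what rest <<< "$base"
    printf '%s %s %s\n' "$user" "$domain" "$what"
}

decode_name '/var/qmail/popuser/train/jane:example.com:spam:20070830120000000000000.4242.done'
```

One reason to consider this form is that it never executes text derived from a filename, which the backtick and sed combination does.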
The saver.sh is run by the apache webmail process and the trainer.sh is run by the popuser. One could tell apache to run webmail.domain under popuser but I want to avoid touching the webserver’s configuration.
The question now is how often to run the trainer. This will depend on the resources your system has, how many mail users you have, how much mail they receive and how much free disk space you have. For example, if you have a small number of users then every hour would be enough, whereas if the server is heavily used during the day you may want to run the trainer during one of the quieter times.
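One rough way to judge the backlog is to count how many messages are waiting for the trainer. The sketch below is only a suggestion; `count_pending` is a hypothetical helper, and because the train directory is not listable by other users it needs to run as popuser or root to see anything.

```shell
#!/bin/bash
# Sketch: count messages waiting for the trainer.
# count_pending is a hypothetical helper, not part of the original scripts.
count_pending() {
    local dir=$1 files
    shopt -s nullglob            # no matches yields an empty list
    files=("$dir"/*.done)
    shopt -u nullglob
    printf '%s\n' "${#files[@]}"
}

count_pending /var/qmail/popuser/train
```

If the count keeps growing between runs, the trainer probably needs to run more often.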
To add to the popuser’s cron enter the following:
Code:
crontab -u popuser -e
When inside the editor, enter the line:
Code:
0 * * * * /var/qmail/popuser/bin/trainer.sh >/dev/null 2> /dev/null
This will run the trainer as the popuser every hour.
In all the cases above the output is directed to /dev/null but for debug you can change it to whatever you like.
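For example, a crontab entry that appends the trainer's output to a log file might look like the line below. The log path /tmp/trainer.log is an assumption chosen only because popuser can write there; pick whatever location suits your system.

```
0 * * * * /var/qmail/popuser/bin/trainer.sh >> /tmp/trainer.log 2>&1
```

Once things are working, switching back to /dev/null stops the log from growing.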
There is another method that can be used: set up a special user to whom you forward your spam / ham. I think that this is a pain for users because you have to forward it and then delete it, whereas both of these can be done using the above method.
It seems strange that there is no documentation for this feature as I think effective spam training is essential when administering a mail server. I understand that everyone loves coding, but hates writing documentation!
If anyone has any comments, suggestions or improvements to any of the above then please add to the comments.
30 comments
Btw, I run FC6 with IMP and it works great. The only additions I have is to ensure you place the saver.sh and trainer.sh in the bin folder and
$ chmod popuser:popuser saver.sh trainer.sh
$ chmod +x saver.sh trainer.sh
Otherwise, fantastic work...you should offer this how-to up to the Horde-IMP folks...
Now I can let my users themselves report their spam as well.
That is fantastic
I tried adding myself as an administrator, and changing it from there, but it doesn't manage to save either, but suggests I edit the file manually. If I copy the suggested text into the conf.php file, it still doesn't appear.
Any thoughts?
Matt
You state:
/usr/share/psa-horde/config/conf.php
but the article states the IMP config file:
/usr/share/psa-horde/imp/config/conf.php
Try adding the options to that instead.
Means that we can finally get our users to sort out their own spam settings a bit, rather than having to try and do them for them within Plesk (a step too far for many), which requires a bit of guesswork anyway.
Thanks for the walk-through, I had given up trying to sort something out myself.
First command should be:
$ chown popuser:popuser saver.sh trainer.sh
Matt
Adam - Why not try something that blocks email rather than accepting and filtering it?
Also there are some companies that go through their captured spam to check for false positives, which blocking it would prevent.
A fatal error has occurred
Failed to import Horde configuration: Strict Standards: Non-static method Horde::getTempDir() should not be called statically, assuming $this from incompatible context in /etc/psa-horde/horde/conf.php on line 81 Strict Standards: Non-static method Util::getTempDir() should not be called statically, assuming $this from incompatible context in /usr/share/psa-horde/lib/Horde.php on line 987
Strict Standards: Assigning the return value of new by reference is deprecated in /usr/share/psa-pear/PEAR.php on line 563
Where do i find this: /usr/share/psa-horde/ in the Plesk Panel.. I have no clue..Please Help!!
Edit the configuration file (/usr/share/psa-horde/imp/config/conf.php) to tell Horde how to do the training, and when to show the “Report as Spam” and “Report as Innocent” links.
But where should this “Report as Spam” and “Report as Innocent” links be displayed in Horde?
Delete | Blacklist | Whitelist | Forward | Report as Spam | Report as Innocent | View Messages
and on the view message page:
Delete | Reply | Forward | Redirect | View Thread | Blacklist | Whitelist | Message Source | Save as | Print | Report as Spam | Report as Innocent
Thanks,
Mark Foster
Thanks.
http://git.horde.org/horde/-/browse/sam/
It's kind of old, and I'm wondering if it's still a viable path for Horde-SpamAssassin integration.
thx for this great tutorial, this is exactly what I was looking for.
Now I got some Problems getting this running.
When I try to run the trainer.sh as popuser or root it says:
# sudo -u popuser ./trainer.sh
ls: cannot access /var/qmail/popuser/train/*.done: No such file or directory
How can I test whether it's working correctly?
I already had the "mark as spam" links and everything available in Horde, so I see no difference compared to before. It just does nothing...
Shouldn't the mails I mark as spam go to the train folder somehow? The filesystem didn't change since creating the folders as described above.
So looks like it's not working huh?
My Plesk Install is customized to hell. I use procmail to write Messages marked as ***SPAM*** by Spamassassin to a Spam-Folder in each user dir. (Method described on huschi.net). Does this generate any potential problems?
When you check messages and click "mark as spam" the messages are piped into the train folder via saver.sh and given a special filename which records the username/ham/etc. The email is then moved into the Trash folder or deleted depending on your configuration.
The cron runs the trainer.sh program which runs sa-learn for each email and deletes it.
You can tell that it is working because files appear in the train folder then disappear.
If you have customised the installation then you should be able to trace what is happening yourself because you are basically on your own!
It looks like the saver.sh isn't triggered from horde UI. When I try to run it manually, it creates those encrypted filenames inside the train dir but theyre not processed until the filename gets the .done ending. When I use bash -x it seems to stop at "cat" and then exit. The encrypted files remain without .done at the end.
But after your last reply I understand what should happen and how this should be working so I think I'll get this working somehow and then report back with what the problem was.
Thanks alot!
This is what I get from horde.log
Dec 14 13:11:52 HORDE [error] [imp] Error reporting spam: sh: /saver.sh: not found
[on line 114 of "/usr/share/psa-horde/imp/lib/Spam.php"]
Dec 14 13:15:58 HORDE [error] [imp] Error reporting spam: sh: /saver.sh: not found
[on line 114 of "/usr/share/psa-horde/imp/lib/Spam.php"]
Dec 14 13:51:55 HORDE [error] [imp] Error reporting spam: sh: /saver.sh: not found
[on line 114 of "/usr/share/psa-horde/imp/lib/Spam.php"]
Dec 14 13:53:14 HORDE [error] [imp] Error reporting spam: sh: /saver.sh: not found
[on line 114 of "/usr/share/psa-horde/imp/lib/Spam.php"]
No matter what I enter as an external spam program, the Horde log always shows that it couldn't find it.
If I use just 'ls' or 'top' instead of calling the saver.sh it tells me that '/ls' or '/top' couldn't be found.
It always adds a '/' before the script or program. Seems like a bug to me.
When I give it '/var/qmail/popuser/bin saver.sh'
the log tells me:
Dec 14 15:15:02 HORDE [error] [imp] Error reporting spam: sh: /bin: Permission denied
[on line 114 of "/usr/share/psa-horde/imp/lib/Spam.php"]
Seems there's something messed up with the ability to launch external programs.
I did a fresh install of imp before, so I didn't change anything that could cause this.
Any ideas on this?
The issue with "sh /command.sh" looks like a new imp security "feature". Plesk 9.5.3 runs psa-imp-4.3.6 on horde-3.3.6 so not sure what you are running if you have freshly installed it. Anyway it looks like it might be chrooting it, so you could try moving the saver.sh and trainer.sh to the home directory instead of the bin folder. Otherwise you'll have to trace through the php code to find out what it's doing.
In mine the lines around 114 are dealing with setting up the pipe to run the command, so you ought to check that the command isn't being manipulated before it's being run.
I didn't find a way to fix the imp action trigger yet and I'm still waiting for an answer in the official support forum regarding this strange behavior.
But after all your solution works great! Thx alot for this!
My next issue is that ingo's filters do not work. Do you know a good tutorial / howto for configuring any kind of webmail vacation functionality under a plesk / horde setup?
Thanks for the useful post!
I tried to install this and I did everything, but I can't see the links "Report as spam" or "Report as innocent". I did edit the file
/usr/share/psa-horde/imp/config/conf.php, so I'm not sure what's wrong. Any advice? I'm on Plesk 8.3
Thank you,
Rob
Kind of stupid, but maybe I forgot to actually enable the Spam Filter from Plesk on the account that I am using to test the script.

