Spam handling
Default spam handling
-
All email to @physics.cornell.edu is passed through dspam, which only tags
emails and never quarantines them; it is up to the user to filter
based on the tags. If dspam thinks that a certain message is spam, it:
- Attaches a prefix [SPAM] to the subject
- Adds a header X-DSPAM-Result: Spam to the email. It is advisable to use this header rather than the subject line for filtering.
If you're using maildrop, the following rule in ~/.mailfilter should file
tagged messages into a folder called .IN-Spam:
# Deliver messages that dspam tagged as spam to the .IN-Spam folder
if (/^X-DSPAM-Result: Spam/)
{
    to "$HOME/Mail/IN-box/.IN-Spam/"
}
Of course, you'll have to create the folder first with
maildirmake ~/Mail/IN-box/.IN-Spam
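If you filter with something other than maildrop, the same header test can be sketched in Python. This is only an illustration, not part of the department's setup: the function name is_dspam_spam is made up here, and the check compares the whole header value to "Spam", which is slightly stricter than the prefix match in the maildrop rule above.

```python
from email import message_from_string


def is_dspam_spam(raw_message: str) -> bool:
    """Return True if the message carries the header 'X-DSPAM-Result: Spam'.

    Hypothetical helper: it looks only at the header, not at the
    [SPAM] subject prefix, as the page recommends.
    """
    msg = message_from_string(raw_message)
    # A missing header is treated as "not tagged".
    result = msg.get("X-DSPAM-Result", "")
    return result.strip().lower() == "spam"


if __name__ == "__main__":
    sample = "Subject: [SPAM] cheap watches\nX-DSPAM-Result: Spam\n\nbody\n"
    print(is_dspam_spam(sample))
```

Testing the header rather than the subject means a message whose subject merely happens to contain [SPAM] is not misfiled.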
Training dspam
-
Sometimes dspam will tag spam as innocent email, or normal email as spam. That is not
unusual; it simply means that dspam needs to learn what you consider spam, so
that it can filter properly in the future. You train it by bouncing or forwarding
emails to the appropriate email address. Since dspam went into effect on May 1st, 2007, the
training method depends on whether you received the email before or after that date.
- If the email came before 1st May 2007, was spam, and you want to train dspam to recognize similar emails as spam, bounce or forward it to old_spam_at_physics.cornell.edu
- If the email came after 1st May 2007, was spam, and was not tagged by dspam as spam, bounce or forward it to spam_at_physics.cornell.edu
- If the email came before 1st May 2007, was not spam, and you want to train dspam to recognize similar emails as "innocent", forward or bounce it to old_ham_at_physics.cornell.edu
- If the email came after 1st May 2007, was not spam, and was tagged incorrectly by dspam as [SPAM], forward or bounce it to ham_at_physics.cornell.edu
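The four cases above amount to a small decision table, which can be sketched as follows. The function name training_address is hypothetical, the addresses are reproduced in the obfuscated form used on this page, and treating mail received on 1st May 2007 itself as "after" is an assumption, since the page only says "before" and "after".

```python
from datetime import date
from typing import Optional

# Date on which dspam went into effect, per this page.
DSPAM_START = date(2007, 5, 1)


def training_address(received: date, is_spam: bool,
                     tagged_spam: bool = False) -> Optional[str]:
    """Pick the training address for one message, or None if none applies.

    Assumption: mail received on DSPAM_START itself counts as "after".
    """
    if received < DSPAM_START:
        # Older mail was never seen by dspam, so there is no tag to correct.
        if is_spam:
            return "old_spam_at_physics.cornell.edu"
        return "old_ham_at_physics.cornell.edu"
    if is_spam and not tagged_spam:
        return "spam_at_physics.cornell.edu"   # missed spam
    if not is_spam and tagged_spam:
        return "ham_at_physics.cornell.edu"    # good mail tagged as [SPAM]
    return None  # dspam tagged the message correctly


if __name__ == "__main__":
    print(training_address(date(2007, 6, 15), is_spam=True, tagged_spam=False))
```

For mail received after the cutoff, the sketch returns None whenever dspam's tag was already right, since the list above only covers mistakes.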
Try to keep the total number of sample spams and sample hams you send to dspam roughly
the same. After a few hundred of these, dspam should hopefully begin to be accurate
enough to make a difference.
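If you collect your training samples in two maildirs before sending them, a quick count can show whether they are roughly balanced. The sketch below uses Python's standard mailbox module; the function names, the example paths, and the 20% tolerance are all assumptions made for illustration, since this page does not define "roughly the same".

```python
import mailbox
from typing import Tuple


def sample_counts(spam_dir: str, ham_dir: str) -> Tuple[int, int]:
    """Return (number of spam samples, number of ham samples)."""
    # create=False: fail instead of silently creating a missing maildir.
    spam = mailbox.Maildir(spam_dir, create=False)
    ham = mailbox.Maildir(ham_dir, create=False)
    return len(spam), len(ham)


def roughly_balanced(spam: int, ham: int, tolerance: float = 0.2) -> bool:
    """True if the counts differ by at most `tolerance` of the larger one.

    The 20% default is an arbitrary choice, not a dspam setting.
    """
    larger = max(spam, ham)
    if larger == 0:
        return True
    return abs(spam - ham) <= tolerance * larger


if __name__ == "__main__":
    # Hypothetical paths; substitute your own sample maildirs.
    n_spam, n_ham = sample_counts("/path/to/spam-samples", "/path/to/ham-samples")
    print(n_spam, n_ham, roughly_balanced(n_spam, n_ham))
```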
Why is training not kicking in?
-
Training dspam is a complex process. In particular, if you feed it a lot of spam (false negatives) without
feeding it enough hams (either false positives or true negatives), the spam
tagging will deteriorate, i.e., it will allow more spam through to
prevent false positives (since it will not have a good idea
what true ham messages look like). So always feed it a bunch of hams
along with the spams. Also, after 2,500 hams, something called statistical sedation is switched off, so that tagging becomes
more aggressive. This is the training threshold. Features like Bayesian noise reduction only kick in after the
training period.
I already have a huge collection of spams and non-spams from SpamAssassin. How do I use that to train dspam?
-
Put all the spams and non-spams in two separate maildirs
(you probably already have that), and send the info to help_at_physics.cornell.edu