Jonathan Matthews wrote:
> Quick Spamassassin question:
>
> I've got SpamAssassin 2.43 installed, and it's working well.
>
> However, I noticed the two lines quoted below in the altered body of
> some spam that it caught recently:
>
> SPAM: EXCUSE_16 (-0.3 points) BODY: I wonder how many emails they sent in
> error...
> SPAM: EXCUSE_14 (-0.2 points) BODY: Tells you how to stop further spam
>
> It seems strange to me that these two reasons should /decrease/ the
> probability of the email being spam.
>
> I know that the weightings attached to different rules are
> user-definable, so I'm not asking "how do I stop this behaviour" - I can
> easily go and redefine the weights.
>
> I'd just like to get some confirmation that these weightings are wrong.
> It's the stock install of SpamAssassin in testing, with no alterations
> made to the config at all. Should I file a bug, change my own
> weightings or go away in shame, having made a fool of myself publicly?

The deal is that SpamAssassin's scores are generated using a genetic
algorithm. They "breed" scores against a corpus of known spam and
non-spam, starting with random scores and mutating them up or down, then
seeing how each set of scores does and letting the winning mutations
thrive. The aim, of course, is to get as few false positives as possible
while still catching as much spam as possible. So the scores are not
something hand-tweaked by a human.

What sometimes happens is that making a score negative reduces the number
of false positives without catching any less spam, at least on their
corpus. And the SA guys, rightly or wrongly, trust their GA to get it
right, and leave these negative scores in. I have mixed feelings about
this, but it seems to work.

--
see shy jo
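
For anyone who wants to see the idea in miniature, the breeding process described above can be sketched roughly as follows. This is a toy illustration only, not SpamAssassin's actual scoring code: the rule names, the corpus, the threshold of 5.0 and the penalty weights are all made up for the example. Depending on the corpus and the random seed, a rule that mostly fires on non-spam may well end up with a negative score, but the sketch does not guarantee it.

```python
import random

THRESHOLD = 5.0  # assumed cutoff: a message scoring at or above this is flagged

# Toy corpus of (rules that fired, is_spam). Rule names are hypothetical.
CORPUS = [
    ({"FREE_MONEY", "EXCUSE"}, True),
    ({"FREE_MONEY", "ALL_CAPS"}, True),
    ({"ALL_CAPS", "EXCUSE"}, False),
    ({"FREE_MONEY"}, True),
    ({"EXCUSE"}, False),
    ({"ALL_CAPS"}, False),
]
RULES = sorted({rule for hits, _ in CORPUS for rule in hits})


def fitness(scores, corpus=CORPUS):
    """Higher is better. A false positive costs far more than a missed spam."""
    penalty = 0.0
    for hits, is_spam in corpus:
        flagged = sum(scores[r] for r in hits) >= THRESHOLD
        if flagged and not is_spam:
            penalty += 10.0  # assumed weight for a false positive
        elif is_spam and not flagged:
            penalty += 1.0   # assumed weight for a false negative
    return -penalty


def mutate(scores, rng):
    """Copy a score set and nudge one randomly chosen rule up or down."""
    child = dict(scores)
    rule = rng.choice(RULES)
    child[rule] += rng.gauss(0.0, 1.0)
    return child


def evolve(generations=200, pop_size=20, seed=0):
    """Start from random scores, keep the better half, refill with mutants."""
    rng = random.Random(seed)
    population = [
        {r: rng.uniform(-5.0, 5.0) for r in RULES} for _ in range(pop_size)
    ]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[: pop_size // 2]
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        population = survivors + children
    return max(population, key=fitness), history


if __name__ == "__main__":
    best, history = evolve()
    for rule in RULES:
        print(f"{rule:12s} {best[rule]:+.2f}")
    print("fitness:", fitness(best))
```

Because the surviving half is carried over unchanged each generation, the best fitness can never get worse from one generation to the next; only the mutated copies take the risk.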