On Sat, 30 Aug 2003 23:40:13 -0400
Tom Allison <[EMAIL PROTECTED]> wrote:

> It may be turned on in the config files, but I am guessing that the code is
> skipping the bayesian score contribution until the mail count gets to 200 on
> each side (ham/spam).
Right.

> I just grabbed a lot of email I had already and fed it into the sa-learn.
> I think I have enough now that it is working.

You can tell by looking at the headers and seeing if BAYES_xx shows up. The xx marks the approximate probability range in which the Bayesian filter places that particular piece of mail. For example, here's the score from the message of yours I am responding to:

X-Spam-Status: No, hits=-3.6 required=5.0 tests=BAYES_10,NO_REAL_NAME version=2.55

So the Bayesian filter (classifier?) thinks it is 10-??% (forget the upper range) likely to be spam. Ah, here it is. From 23_bayes.cf...

body BAYES_10 eval:check_bayes('0.10', '0.20')

...10 to 20%, which gives it a score of...

score BAYES_10 0 0 -5.300 -4.701

...-4.701 based on my setup. IIRC the four scores are, in order: Bayes and network checks both disabled, network checks only, Bayes only, and both enabled, so the fourth column is the one that applies here. Well, let's see. NO_REAL_NAME nets the message...

score NO_REAL_NAME 0.993 0.820 1.137 1.149

...1.149. -4.7 + 1.1 = -3.6

> I'm not sure, I just kind of fiddled with it a few times in the early hours
> and got it working.

Yeah, it just takes a little bit to kick in. Once it does, the difference is dramatic if you track the scores. Average ham for me is around -3 and average spam is closer to 12 to 15. That affords me a lot of latitude when configuring sa-exim to reject things at SMTP time.

-- 
         Steve C. Lamb         | I'm your priest, I'm your shrink, I'm your
       PGP Key: 8B6E99C5       | main connection to the switchboard of souls.
-------------------------------+---------------------------------------------
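
The score arithmetic worked through in the message can be sketched in a few lines of Python. This is only an illustration, not SpamAssassin's own code: the helper names (parse_scores, total_hits) are made up for the example, the two score lines are copied from the message, and the column order is assumed to be SpamAssassin's four score sets (index 3 being Bayes and network checks both enabled).

```python
# Illustrative sketch only; parse_scores and total_hits are hypothetical helpers.
# Score lines copied from the message. Assumed column order:
#   0: no Bayes, no network   1: network only
#   2: Bayes only             3: Bayes and network
SCORE_LINES = """
score BAYES_10 0 0 -5.300 -4.701
score NO_REAL_NAME 0.993 0.820 1.137 1.149
"""


def parse_scores(text):
    """Map each test name to its list of per-score-set values."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "score":
            table[parts[1]] = [float(value) for value in parts[2:]]
    return table


def total_hits(tests, table, score_set):
    """Sum the scores of the tests that fired, using one score-set column."""
    return sum(table[name][score_set] for name in tests)


table = parse_scores(SCORE_LINES)
# Tests listed in the X-Spam-Status header of the quoted message.
hits = total_hits(["BAYES_10", "NO_REAL_NAME"], table, score_set=3)
print(round(hits, 1))  # -3.6
```

With the fourth column selected, the sum is -4.701 + 1.149 = -3.552, which rounds to the hits=-3.6 reported in the X-Spam-Status header.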