On Sun, Jan 16, 2005 at 12:25:07PM -0800, Jefferson Cowart wrote:
> Just to chime in with a note on this topic. The default scores (as
> distributed by upstream) are dynamically generated by running the rules
> against corpora of known spam and known ham. Based on which rules match
> the SPAM and which match the HAM, the scores are computed so as to
> minimize the number of false positives and false negatives. This means
> that if a rule ends up with a high positive score, it is a good indicator
> of SPAM (at least based on the corpus it was run against). The opposite
> is true of large negative scores and HAM.
> 
> For more information check out
> http://wiki.apache.org/spamassassin/HowScoresAreAssigned

Thank you, Jefferson.

Mathieu,

I hope this will be my last message to this bug report.

Please visit the link above.

Furthermore, I want to show you the following statistics from
/usr/share/doc/spamassassin/rules/STATISTICS-set1.txt.gz (set 1 is the
score set without Bayes, with network tests):

OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
 766289   506205   260084    0.661   0.00    0.00  (all messages)
100.000  66.0593  33.9407    0.661   0.00    0.00  (all messages as %)
 30.961  46.6673   0.3922    0.992   0.48    0.14  RCVD_IN_SORBS_DUL
 31.331  47.2168   0.4133    0.991   0.47    1.66  RCVD_IN_NJABL_DUL

RCVD_IN_NJABL_DUL hits 47% of spam messages and 0.4% of ham messages
(based on our test corpora). When RCVD_IN_NJABL_DUL is hit, there is
statistically a 99.1% chance that it is spam.
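
As a quick check of the S/O column, here is a minimal Python sketch. It
assumes S/O is the spam hit rate divided by the sum of the spam and ham hit
rates. That definition is an assumption which happens to reproduce the two
rule rows quoted above; it is not taken from the statistics file itself, and
the helper name s_over_o is made up for illustration.

```python
# Hit rates (in percent) copied from the two rule rows quoted above.
rows = {
    "RCVD_IN_SORBS_DUL": (46.6673, 0.3922),
    "RCVD_IN_NJABL_DUL": (47.2168, 0.4133),
}


def s_over_o(spam_pct, ham_pct):
    """Assumed definition: spam hit rate over the sum of both hit rates."""
    return spam_pct / (spam_pct + ham_pct)


for name, (spam_pct, ham_pct) in rows.items():
    print(f"{name}: S/O = {s_over_o(spam_pct, ham_pct):.3f}")
```

Under that assumption the sketch gives 0.992 for RCVD_IN_SORBS_DUL and 0.991
for RCVD_IN_NJABL_DUL, matching the S/O column.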

The perceptron assigned it a score of 1.7, and there's absolutely no
way I'm going to change it in the default Debian distribution.

We're not discriminating, we're just using statistics. We don't make
these scores up off the top of our heads.

If you feel obliged to respond to this mail, don't hold your breath
for a response; you likely won't get one. You have the right to appeal
to the Technical Committee if you so desire.

I'm done discussing this.

-- 
Duncan Findlay

