on Thu, Aug 28, 2003 at 03:09:48PM +0200, Andreas Metzler ([EMAIL PROTECTED]) wrote:
> Karsten M. Self <[email protected]> wrote:
> [...]
> > SpamAssassin achieves a false-positive rate (non-spam reported as spam)
> > of 5% with a default threshold of 5. This can be dramatically improved
> > using a whitelist, to ~98% in my experience. This is not the best
> > performance of all filters, so makes a somewhat generous threshold.
> >
> > http://www.spamassassin.org/dist/rules/STATISTICS.txt
> > http://freshmeat.net/articles/view/964/
> >
> > So a spam-reduction system user would at worst see a typical rate of 2%
> > of spam to be manually disposed of.
> [...]
>
> You are mixing up percentages. "5% non-spam reported as spam" ... can
> be ... improved to ~98% ...

Correct. And yes, I was thinking "false-negative". Spam not flagged as
spam.
What I meant to say was this:
- Currently feasible content-based filters + whitelists can hold the
share of spam passing to the inbox to about 2%, by independent tests.
- A C-R system should then target having no more than 2% of challenges
sent be misdirected (based on spoofed headers, etc.). At this rate,
it's still transferring burden inappropriately, but at a level that
matches a reasonable-case technological alternative. This also
achieves a secondary goal in the interests of C-R proponents of
keeping the incidence of false challenges low enough that recipients
would be likely to respond to the challenge.
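
A rough sketch of that comparison in Python, assuming you can count how
many challenges a C-R system sent and how many of those went to someone
who never wrote the mail. The function names are made up for
illustration, and the 2% target is just the figure from this message,
not something measured here.

```python
# Hypothetical sketch: compare a C-R system's misdirected-challenge rate
# with the ~2% of spam that a content filter plus whitelist lets through.

FILTER_MISS_RATE = 0.02  # figure quoted in this message, not measured here


def misdirected_rate(challenges_sent: int, misdirected: int) -> float:
    """Fraction of challenges sent to someone who never wrote the mail."""
    if challenges_sent == 0:
        return 0.0  # nothing sent, so nothing misdirected
    return misdirected / challenges_sent


def meets_target(challenges_sent: int, misdirected: int,
                 target: float = FILTER_MISS_RATE) -> bool:
    """True when the misdirected fraction is no higher than the target."""
    return misdirected_rate(challenges_sent, misdirected) <= target
```

For example, 15 misdirected challenges out of 1000 would pass, and 50
out of 1000 would not.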
> When I last checked my personal rate with spamassassin 2.55 with
> default rules and no DNS lists or razor (but including a rather well
> trained bayesian filter) and a default threshold of 5, I came up with
> these numbers[1]:
> * 0% false positives, i.e. ham sorted into the spam folder
> * 10% of the spam was not recognized as such and I had to filter it
> out by hand.
I use a whitelisting system. It's based on Lars Wirzenius's spamfilter
package; my local addition is a shell script that scans messages for
senders to add to white, black, grey, or spam lists. Mail from previously
unknown senders ends up in a "grey" box. The principle is the same as
C-R, except that assessment is done by me, rather than a third party.
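
The sorting step might look something like the following. This is a
minimal sketch in Python, not the spamfilter package's code or my shell
script; the `triage` function and the layout of `lists` (list name
mapped to a set of lowercase addresses) are assumptions for
illustration.

```python
# Hypothetical sketch of sender-list triage; not the spamfilter package's code.
from email import message_from_string
from email.utils import parseaddr


def triage(raw_message: str, lists: dict) -> str:
    """Return the name of the box a message should be filed in."""
    msg = message_from_string(raw_message)
    # Pull the bare address out of a header like "Name <user@host>".
    _, addr = parseaddr(msg.get("From", ""))
    addr = addr.lower()
    # Known senders go to the list they were previously assigned to.
    for name in ("white", "black", "spam"):
        if addr in lists.get(name, set()):
            return name
    # Previously unknown senders wait in the grey box for manual review.
    return "grey"
```

Matching on the From header alone is of course trivially spoofable; the
sketch only shows the sorting, not any defence against forged senders.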
Peace.
--
Karsten M. Self <[email protected]> http://kmself.home.netcom.com/
What Part of "Gestalt" don't you understand?
Verio webhosting? Guaranteed downtime:
http://www.wired.com/news/politics/0,1283,57011,00.html
http://www.dowethics.com/r/environment/freedom.html

