On Quintidi 5 Floréal, year CCXXIV, Gene Heskett wrote:
> I feed it ham by moving stuff it should catch to the ham directory so it's
> treated as ham on the next runs of sa-learn. I could add a weekly
> sa-learn --ham session, naming one or more of the cleaner folders from
> the mailing lists I suppose. It was initially run over my cleaned up
> coco folder, which is several gigabytes of quite clean ham. The initial
> run was many years ago. Or, I suppose, I could use an int rnd(number of
> good folders) and then a stack of ifelses to do a name substitution based
> on the number and have it read them all at random intervals.

AFAIK, the best way of training a Bayesian filter is to feed it content that
is typical of what it has to classify, and in particular spam and ham in the
same ratio as what you receive.

Also, beware if there are some kinds of mail that you want to receive but
never archive (for example notifications), since you may never teach the
filter to recognize them as ham.

OTOH, feeding enough ham is necessary to avoid false positives. This
sub-thread was about false negatives, IIRC.

Regards,

-- 
  Nicolas George