Michelle Konzack wrote:
> On 2005-03-21 21:53:45, Bob Proulx wrote:
> > Package: mutt
> > Version: 1.5.6-20040907+3
> > Severity: normal
> > 
> > When upgrading machines from woody to the current sarge snapshot the
> > change in pipe_decode setting breaks spam processing scripts.  The upstream
> > binary does not set pipe_decode and the woody version did not either.  But
> > the current sarge snapshot sets pipe_decode in /etc/Muttrc.  Please remove
> > this setting as this breaks many scripts for users.
> 
> Can you explain, why pipe_decode breaks "spam processing scripts" ?

With pipe_decode=yes mutt will only pipe the portion of the message
that you have configured to display.  Basically what you see on the
display is what gets piped to the external program.

Specifically, mutt splits multipart/alternative messages up and pipes
only the alternative that it is configured to display.  For most mutt
users I would guess that text/plain is the display format of choice.
That means that text/html parts are not piped to the external program.
Programs reading the piped message will not see any html portion of
the message.  The html portion is usually the spam payload.  This
means that spam filters such as bayes engines cannot learn properly
from the message.  And in fact the bayes engine can get terribly
confused by the "bayes poison" included in many messages if that is
the only portion of the message it is learning from.
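
To make the difference concrete, here is a minimal sketch in Python.
It does not call mutt; displayed_part_only() is a hypothetical stand-in
that only approximates what pipe_decode=yes hands to a script when
text/plain is the preferred alternative.  The addresses and message
text are made up for illustration.

```python
from email.message import EmailMessage


def build_sample():
    """Build a made-up multipart/alternative message (plain + html)."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "user@example.org"
    msg["Subject"] = "hello"
    # Harmless looking plain text part, the kind used as "bayes poison".
    msg.set_content("Innocent looking filler words.\n")
    # The html alternative carries the actual payload.
    msg.add_alternative(
        "<html><body><b>Buy pills now</b></body></html>\n",
        subtype="html",
    )
    return msg


def displayed_part_only(msg):
    """Hypothetical approximation of pipe_decode=yes: plain part only."""
    return msg.get_body(preferencelist=("plain",)).get_content()


def verbatim(msg):
    """What a filter receives when the message is piped unmodified."""
    return msg.as_string()


msg = build_sample()
decoded = displayed_part_only(msg)
raw = verbatim(msg)

print("payload visible in decoded text:", "Buy pills now" in decoded)
print("payload visible in verbatim message:", "Buy pills now" in raw)
```

A filter fed only the decoded text never sees the html payload, while
the verbatim message contains both alternatives.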

Additionally mutt filters the headers ignoring whatever has been
configured to be ignored.  Usually all of the Received: headers among
others are weeded by default.  This is another critical piece of
information to spam filters.  Spam filters try to understand the
headers to determine at what point the message entered your network.
With SpamAssassin this is the trusted_networks setting.  It will then
look up the machine that sent the message to your network in various
RBLs and use that information to help score the message.
Without being able to see those headers it is not possible to use
those header checks.  'sa-learn' from spamassassin also uses the
headers to train the bayes engine and to score spam.

As a mutt customization users may also ignore or unignore additional
headers.  A user may choose to display or hide any particular header.
Programs that try to make sense of the headers generally break when
some of them are missing.
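
The effect of header weeding can be sketched the same way.  The
following Python is only an illustration: weed_headers() and the SHOWN
list are hypothetical and merely mimic an ignore/unignore configuration
that displays a handful of headers.  The sample message, hosts, and
addresses are invented.

```python
from email import message_from_string

# Made-up message with two Received: hops.
RAW = (
    "Received: from mx.example.net (mx.example.net [192.0.2.10])\n"
    "\tby mail.example.org; Mon, 21 Mar 2005 21:50:00 -0700\n"
    "Received: from unknown (HELO spammer) (203.0.113.7)\n"
    "\tby mx.example.net; Mon, 21 Mar 2005 21:49:58 -0700\n"
    "From: sender@example.com\n"
    "To: user@example.org\n"
    "Subject: hello\n"
    "Date: Mon, 21 Mar 2005 21:49:50 -0700\n"
    "\n"
    "body text\n"
)

# Hypothetical set of headers the user has chosen to display.
SHOWN = ("from", "to", "subject", "date")


def weed_headers(raw, shown=SHOWN):
    """Keep only the displayed headers, roughly as a weeded pipe would."""
    msg = message_from_string(raw)
    kept = [(name, value) for name, value in msg.items()
            if name.lower() in shown]
    head = "".join(f"{name}: {value}\n" for name, value in kept)
    return head + "\n" + msg.get_payload()


def received_hops(raw):
    """Return the Received: headers a spam filter could inspect."""
    return message_from_string(raw).get_all("Received", [])


weeded = weed_headers(RAW)
print("hops in verbatim message:", len(received_hops(RAW)))
print("hops after weeding:", len(received_hops(weeded)))
```

With the Received: lines gone there is nothing left for a filter to
check against trusted_networks or an RBL.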

Since spam filters need the full message this setting of pipe_decode
breaks them by preventing them from seeing the entire original
message.

> If I unset pipe_decode, I will run into trouble with all of my
> "processing scripts". And 'sa-learn' for example works better
> WITH pipe_decode as without.  Same for 'blacklister'.

I am not familiar with 'blacklister' but I am very familiar with
'sa-learn' from spamassassin.  'sa-learn' needs the verbatim message
in order to train the bayes engine properly.  If it cannot see the
text/html portion of the message then it will not be able to train
properly.  If it cannot see the headers of a message then it cannot
learn the strong indicators of spam and non-spam in the message.  I
know you said it works better for you but I find it very hard to
understand how that could be so.  I think this setting would break
most users of sa-learn.

I am probably missing other uses but the one use that would make sense
for pipe_decode=yes would be to pipe to a printing program.  When
printing your mail you would probably want to print what you see.
However there is a dedicated print-message function in mutt that is
separate from pipe-message.  Therefore there is no need to use
pipe-message for the purposes of printing.

Bob

-- 
Bob Proulx <[EMAIL PROTECTED]>
http://www.proulx.com/~bob/
