Earl Hood wrote:
On February 20, 2003 at 11:58, Jeff Breidenbach wrote:


It's a constant battle against the spambots. We do a lot of
anti-spambot stuff already, but there are a lot of spam miscreants out
there. Now that MHonArc 2.6 is out I will probably switch to (succumb
to?) completely censoring the email addresses appearing in message
bodies, i.e. replace them with something like [EMAIL PROTECTED].  I'll
probably give the new MHonArc a few more weeks to cut its teeth in the
wild before switching over.
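
A minimal sketch of the kind of replacement described above, written in Python rather than MHonArc's own Perl. The `mask_addresses` name and the regular expression are illustrative assumptions, not MHonArc's actual implementation, and the pattern is deliberately loose, so it will not catch every valid address.

```python
import re

# Hypothetical, deliberately simple pattern: local part, "@", then a domain
# containing at least one dot. Not a full RFC 2822 address parser.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def mask_addresses(text: str, placeholder: str = "[EMAIL PROTECTED]") -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_RE.sub(placeholder, text)


if __name__ == "__main__":
    print(mask_addresses("Contact jane.doe@example.org for details."))
```

Masking the whole address, rather than obfuscating only part of it, leaves nothing for a harvester to reconstruct, at the cost of readers no longer being able to see who was mentioned.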

Before switching, you may want to review htdig's support for
character encodings and its handling of character entity references.
That way you will know whether you have to customize some MHonArc
resources so that the generated HTML is friendly to htdig.

Note, this issue mainly applies to the non-English archives that
you host.  Ideally, it would be nice to leverage the UTF-8 support
in MHonArc 2.6 and convert all messages to UTF-8 when archived.
Unfortunately, I do not think htdig supports UTF-8.
BTW, ASPseek does (it stores all data internally in UTF-8, so it's not a problem to have many different encodings in one DB, including even CJK ideographs).

I suspect* that if you specify the charset correctly (by having, say,

<HEAD>
...
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=windows-1251">

in the HTML document), ht://Dig will understand it and handle the document correctly.
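
To illustrate what honoring such a declaration involves, here is a rough Python sketch. It is not ht://Dig's or ASPseek's actual code: the `declared_charset` and `to_utf8` helpers are hypothetical, and the ISO-8859-1 fallback is an assumption. It looks for a `charset=` declaration near the top of the document and uses it to convert the bytes to UTF-8.

```python
import re

# Hypothetical helper pattern: matches charset=NAME, with optional quoting,
# anywhere in the inspected bytes (case-insensitive).
CHARSET_RE = re.compile(rb"charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE)


def declared_charset(html_bytes: bytes, default: str = "iso-8859-1") -> str:
    """Return the charset named in the first 4 KB, or an assumed default."""
    match = CHARSET_RE.search(html_bytes[:4096])
    return match.group(1).decode("ascii").lower() if match else default


def to_utf8(html_bytes: bytes) -> bytes:
    """Decode using the declared charset and re-encode as UTF-8.

    Undecodable bytes are replaced rather than raising an error.
    """
    text = html_bytes.decode(declared_charset(html_bytes), errors="replace")
    return text.encode("utf-8")
```

Note that this sketch leaves the META declaration itself untouched, so a real converter would also need to rewrite it to say UTF-8.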

* DISCLAIMER: I'm not an ht://Dig expert, but rather an ASPseek guru :)

--
== kir_at_asplinux.ru == 7551596_at_ICQ == 6722750_at_sms.beemail.ru ==

Stuckness shouldn't be avoided. It's the psychic predecessor of all
real understanding. An egoless acceptance of stuckness is a key to an
understanding of all Quality, in mechanical work as in other endeavors.
-- R. Pirsig, "Zen and the Art of Motorcycle Maintenance"


_______________________________________________
Gossip mailing list
[EMAIL PROTECTED]
http://jab.org/cgi-bin/mailman/listinfo/gossip
