> The new mhonarc config should do charset conversion if possible,
> or just output the text as-is in the case charset of the mail is utf8
> or unknown.

It's not that simple. Leaving a raw '<' character in the output can
cause security issues, so the text cannot simply be passed through
as-is. Anyway, the relevant portion of the mhonarc manual is the
<CHARSETCONVERTERS> resource:

http://www.mhonarc.org/MHonArc/doc/resources/charsetconverters.html

Take a look at mhonarc::htmlize versus MHonArc::CharEnt::str2sgml, and
possibly discuss this on the upstream mailing list.

However, I personally recommend that mhonarc be set to convert
everything to UTF-8, no exceptions. That simplifies a lot of things:
mixed languages in a single message, mixed-language index pages, easier
linguistic analysis and data mining of the HTML, etc. Bending over
backwards for incorrectly labelled character sets on inbound email seems
more trouble than it is worth.

Incidentally, I was probably put on CC: because I'm the mhonarc package
maintainer. But I should also mention that one of my other hats is
helping run mail-archive.com, which provides secondary archival service
for all Debian mailing lists, with permission of the (former) DPL. The
service is also available for any other Debian team or group currently
wrestling with mhonarc configuration. So that is a possible fallback if
needed.

Cheers,
Jeff
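
P.S. For illustration, a rough sketch of what "convert everything to
UTF-8" could look like in a resource file. It assumes the TEXTENCODE
resource and the MHonArc::UTF8 module shipped with MHonArc 2.6.x, which
the message above does not name, so please check it against the manual
before relying on it. Once all text is re-encoded to UTF-8, the converter
only has to escape HTML-special characters such as '<'.

```
<!-- Assumption: re-encode all incoming text to UTF-8 on input -->
<TextEncode>
utf-8; MHonArc::UTF8::to_utf8; MHonArc/UTF8.pm
</TextEncode>

<!-- With everything already in UTF-8, just escape '<', '>', '&' -->
<CharsetConverters override>
default; mhonarc::htmlize
</CharsetConverters>
```

The "override" attribute is meant to discard the built-in converter
table, so every character set, including unknown ones, falls through to
the single default entry.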