>>>>> "EM" == Etienne Millon <etienne.mil...@gmail.com> writes:
EM> Can you try to put the following line in your config.py ?

EM> CHARSET_LIST='US-ASCII', 'UTF-8', 'BIG5', 'ISO-2022-JP', 'ISO-8859-1'

Ahh.  It had been so long since I set up my current config that either
I had forgotten about r2e's config.py or my setup predated
CHARSET_LIST...

I used »CHARSET_LIST='US-ASCII', 'ISO-8859-1', 'UTF-8'« because those
cover all of the feeds which I monitor.

There is little reason to have 8859-* after utf-8; utf-8 can encode
anything, so the list would never fall through to them.  But having
8859-1 ahead of utf-8 can be beneficial.  Something in my chain forces
r2e's 8859-1 to cte:qp and its utf-8 to cte:b64.  Given that the former
can be read w/o decoding, it is useful to permit its use.

I don't know where the CJ encodings should fall in the default set.
Having them ahead of utf-8 causes harm for non-Asian-language feeds,
but having utf-8 first will unify zh and jp text.  That is, if the
reader's MUA is configured to prefer a jp font for ideographs, then zh
text in utf-8 will be rendered with that jp font.  And vice versa if
their MUA prefers a zh_{CN,TW,HK} font.

I don't know how much harm that would do.  I usually can recognize
zh_CN vs zh_TW vs jp vs kn text, but cannot actually read any of
them....

Were the character-set matching to limit big5 and 2022 to characters
which are not used outside Asia, though, they could remain before
utf-8.  That means that neutral chars -- like the quotes -- should not
match the CJK character sets.

Only matching characters which have width property W in Unicode's
EastAsianWidth.txt should do the trick.  (The quotes have A, presumably
for Ambiguous.)

Explicitly configuring it, though, does fix things for me.

Thanks.

-JimC
-- 
James Cloos <cl...@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6
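
For illustration, the width-based matching suggested above might look
roughly like the following.  This is only a sketch and not r2e's actual
code: pick_charset, has_wide_char and CJK_CHARSETS are hypothetical
names, the list order is just an example, and "the text contains at
least one W character" is one possible reading of the proposal.

```python
import unicodedata

# Example order only (an assumption, not r2e's default): the CJK
# charsets sit ahead of UTF-8 but are gated by the width check below.
CHARSET_LIST = ['US-ASCII', 'BIG5', 'ISO-2022-JP', 'ISO-8859-1', 'UTF-8']

# Hypothetical set of charsets that should only match East Asian text.
CJK_CHARSETS = {'BIG5', 'ISO-2022-JP'}


def has_wide_char(text):
    """Return True if any character has East Asian Width property W."""
    return any(unicodedata.east_asian_width(ch) == 'W' for ch in text)


def pick_charset(text, charsets=CHARSET_LIST):
    """Return the first charset in the list that is allowed to match."""
    for name in charsets:
        # Skip CJK charsets unless the text holds at least one wide
        # character, so neutral characters such as curly quotes
        # (width A) cannot select them on their own.
        if name in CJK_CHARSETS and not has_wide_char(text):
            continue
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue
        return name
    # Fallback in case the caller's list does not end with UTF-8.
    return 'UTF-8'
```

With this gate, a Latin-script entry containing only curly quotes falls
through to UTF-8 rather than matching BIG5, while text with ideographs
can still pick a CJK charset.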