B> I think it is perfectly reasonable to treat all characters the same in
B> this regard. You might desire some different behavior e.g. if you edit
B> western text with variable width fonts, so 'l' takes much less space
B> than 'W', but such behavior is a feature request, don't you think?

No no no! All Chinese characters are two ASCII characters wide upon
display. All ASCII characters are one ASCII character wide upon
display. That is all I am asking for. Probably related to:

wcswidth (3) - determine columns needed for a fixed-size wide character string

The challenge: download e.g., http://jidanni.org/me/index.html .
Note how it fits nicely in your 80 column wide UTF-8 capable editor.
Now dare to run tidy -utf8 on it. Disaster. Many lines now go way off
the edge of the editor. That's why I invented dantidy, below.

B> I see a feature request that ... -raw

Forget about -raw. The problem occurs with just -utf8. LC_ALL=...
doesn't help either.

Anyway, please just count each Chinese character as two ASCII
characters when determining whether you have reached the wrap column.

An _unrelated_ issue is how many bytes a Chinese character uses in a
file on your disk. In UTF-8 it uses 3. In big5 it uses 2. Don't get
fooled by that!

$ cat dantidy #what I am forced to use
#!/bin/sh -e
perl -C -pwe 's/[^[:ascii:]]/sprintf "\\x{%04x}",ord $&/ge'|#my uni2ascii(1)
tidy --gnu-emacs yes --indent-spaces 1 --indent auto -utf8 \
--tidy-mark no --wrap-attributes yes --wrap-script-literals yes -quiet|
perl -C -pwe 's/\\x\{([[:xdigit:]]{4})\}/chr eval "0x$1"/eg' #my ascii2uni(1)