Adam Borowski writes ("UTF-8 in jessie"):
> I would like to propose full UTF-8 support.  I don't mean here full
> support for all of Unicode's finer points, merely complete eradication of
> mojibake.  That is, ensuring that /m.o/ matches "möo", or that "ä" sorts
> as equal to "a""combining ¨" is out of scope of this proposal.
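
To make the scope of the quoted proposal concrete, here is a rough sketch in Python (my choice of language, not anything taken from the proposal). It assumes "mojibake" means UTF-8 bytes being decoded with the wrong charset, here Latin-1, and uses the regex and combining-character cases from the quote as the out-of-scope examples. All variable names are illustrative only.

```python
import re
import unicodedata

text = "m\u00f6o"                       # "möo" with a precomposed ö
utf8_bytes = text.encode("utf-8")       # b'm\xc3\xb6o'

# In scope: mojibake. UTF-8 bytes misread as Latin-1 (an assumed wrong
# charset) turn the single character ö into two unrelated characters.
mojibake = utf8_bytes.decode("latin-1")

# Out of scope: whether /m.o/ matches. It does on decoded text, but on
# the raw bytes "." consumes only one of the two bytes of ö.
matches_decoded = re.fullmatch(r"m.o", text) is not None
matches_bytes = re.fullmatch(rb"m.o", utf8_bytes) is not None

# Out of scope: treating precomposed "ä" and "a" + U+0308 (combining
# diaeresis) as equal. That needs normalization, not just correct decoding.
precomposed = "\u00e4"
decomposed = "a\u0308"
equal_raw = precomposed == decomposed
equal_nfc = unicodedata.normalize("NFC", decomposed) == precomposed
```

Only the first case is what the proposal sets out to eliminate; the other two are the kind of finer point it explicitly leaves alone.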

I agree with everything you propose, with one reservation regarding
this:

> 4. all text files should be encoded in UTF-8

I agree with this, except that I think a text file should be permitted
to use ASCII codepoints.

You may say "but UTF-8 is a superset of ASCII".  Well, no, it isn't.
UTF-8 is a superset of ISO-646, but ISO-646 is not identical to ASCII.
In particular, the descriptions of the codepoints ` ' in ISO-646
effectively forbid them from being used as matching single quotes,
despite that being specified as allowed in ASCII.

I don't think that better UTF-8 support should involve needlessly
converting 7-bit ASCII text files which use ` ' as matched quotes into
UTF-8 text files which use non-ISO-646 codepoints.

(In fact I would like to see Markus Kuhn's decision about ` ' reversed:
our default character set should be ASCII for 0..127 plus UTF for the
rest.  That's not an argument I expect to win, but at the very least we
shouldn't have to worsify things for ASCII users.)

Thanks,
Ian.