On 23/09/12 05:03, Steven D'Aprano wrote:
On 23/09/12 04:29, Paul Crawford wrote:

What I hate about unicode was the idea of adopting 16-bit characters and
thus breaking so much byte-orientated code that was written, tested, and
integrated over the history of computing.

You make it sound like the Unicode Consortium hacked into people's
computers
and changed their existing 8-bit ASCII files into 16-bit UCS-2 files. I'm
pretty sure that never happened.

The point I was hoping to make was not to denigrate the desirability of a single universal character set, but to question the specific idea of the UCS-2 representation.

For example, it is (was?) the case that if you wanted proper multi-language support on Windows NT (and later), you had to rewrite any application to use 16-bit 'wide' character strings, thus breaking anything written in the past that assumed byte-orientated text.
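To make that concrete, here is a quick C sketch of the sort of change involved (my own untested illustration; _wfopen is the Microsoft CRT's wide-character counterpart of fopen and is Windows-only):

    #include <stdio.h>
    #include <wchar.h>

    int main(void)
    {
        /* Legacy byte-orientated code: 8-bit chars throughout. */
        const char *name = "report.txt";
        FILE *f = fopen(name, "rb");

        /* The same call rewritten for the NT 'wide' world: every
         * string literal, buffer and library call changes type.
         * (_wfopen: Microsoft CRT / Windows only.) */
        const wchar_t *wname = L"report.txt";
        FILE *wf = _wfopen(wname, L"rb");

        if (f)  fclose(f);
        if (wf) fclose(wf);
        return 0;
    }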

And that is a *lot* of useful stuff that we are talking about: libraries, applications, storage devices, file compression utilities, etc.

Now you may have a point that the use of byte-orientated and NUL-terminated strings as developed for C/UNIX was possibly short-sighted, but in the context of 1960s/70s computing it was reasonable, and quite possibly necessary, in order to be usably fast on the hardware of the day.

UCS-2 breaks that by going 16-bit wide, with NUL upper bytes in the most common cases, and it requires a byte-order mark to cope with differing CPU architectures. Both problems should have been obvious at the time, so I don't know why it was adopted in that form.
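A small C sketch (again, just my own illustration of the two problems) shows why byte-orientated code chokes on it:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* "Hi" in UCS-2, little-endian: every ASCII character carries
         * a NUL upper byte, so strlen() stops after one byte. */
        const char ucs2_le[] = { 0x48, 0x00, 0x69, 0x00, 0x00, 0x00 };
        printf("strlen() sees %zu byte(s)\n", strlen(ucs2_le));  /* 1 */

        /* The same text on a big-endian machine is a different byte
         * sequence, hence the need for a byte-order mark (U+FEFF). */
        const char ucs2_be[] = { 0x00, 0x48, 0x00, 0x69, 0x00, 0x00 };
        printf("strlen() sees %zu byte(s)\n", strlen(ucs2_be));  /* 0 */

        return 0;
    }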

UTF-8, on the other hand, allows a universal character set (and one much bigger than UCS-2) *and* it works with legacy code that relies on byte-represented text with NUL string terminators and all of the corresponding infrastructure built around that.
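For example (a sketch, assuming the text "café" encoded as UTF-8):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* UTF-8: ASCII characters keep their single-byte values, and
         * the bytes of a multi-byte sequence are always 0x80-0xFF, so
         * a NUL byte never appears inside a character. */
        const char *s = "caf\xC3\xA9";   /* "café": e-acute is C3 A9 */
        printf("%zu bytes, NUL-terminated as before\n", strlen(s)); /* 5 */

        /* Legacy strcpy()/strcat()/fopen()-style code passes this
         * through unchanged; only code that counts or inspects
         * individual *characters* has to know about UTF-8 at all. */
        return 0;
    }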

Regards, Paul

