On 6/13/17 7:58 PM, L A Walsh wrote:
>
> Chet Ramey wrote:
>> On 6/2/17 6:23 PM, L A Walsh wrote:
>>
>>> As for unsupported systems, there is a reason they are no longer
>>> supported. The world is already using UTF-8. It's only a few
>>> luddites clinging to ascii as a last refuge. ;-)
>>>
>>> What display/OS do you have that you can't run UTF-8 on?
>>
>> This is a red herring. de_DE.UTF-8 and zh_KH.UTF-8 don't use the same
>> character set.
>>
> ---
> They use the same encoding. Whether or not they use
> the same character set is up to someone's preference. Looking at my
> local fonts, it looks like 'Code2000' covers both of those ranges:
> Germany and Khmer ? But using 1 font for both isn't necessary.

That's not relevant to the issue of whether or not a particular character
is classified as alphabetic in one locale and not another. That is the
largest issue with locale-specific identifiers. I'm not as concerned with
how a particular character displays; that's not important to interpreting
the script.

> Forgive me if I'm misremembering, but hasn't Greg argued against
> the ability to supply "libraries" of re-usable scripts due to
> the ease with which names could conflict with each other and cause
> script incompatibilities?

I'm sure he has. It's a genuine problem without namespaces, so you have
to adopt some naming convention that provides pseudo-namespace
functionality.

> If it is the case that script libraries had access to unicode
> var & func names (and used it), wouldn't that significantly
> decrease the chances of conflict? Right now, what, ...
> maybe A-Za-z_0-9 + maybe a few others == that's about 64 chars?

63 x as many characters are in your identifier. You can easily choose
some prefix (like readline uses _rl_ and rl_) and reduce the potential
for clashes.

> Even if a character doesn't display in your locality, doesn't
> mean it wouldn't work -- i.e. if I don't have a Cyrillic font
> installed, that doesn't mean the script wouldn't work -- as
> the characters would still be encoded as their Unicode values.

A character that is classified as being a valid alphabetic in one locale
may not be such in another, regardless of its encoding.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/