On 6/13/17 7:58 PM, L A Walsh wrote:
>
>
> Chet Ramey wrote:
>> On 6/2/17 6:23 PM, L A Walsh wrote:
>>
>>
>>> As for unsupported systems, there is a reason they are no longer
>>> supported. The world is already using UTF-8. It's only a few
>>> luddites clinging to ascii as a last refuge. ;-)
>>>
>>> What display/OS do you have that you can't run UTF-8 on?
>>>
>>
>> This is a red herring. de_DE.UTF-8 and zh_KH.UTF-8 don't use the same
>> character set.
>>
> ---
> They use the same encoding. Whether or not they use
> the same character set is up to someone's preference. Looking at my
> local fonts, it looks like 'Code2000' covers both of those ranges:
> German and Khmer? But using one font for both isn't necessary.
That's not relevant to the issue of whether or not a particular character
is classified as alphabetic in one locale and not another. That is the
largest issue with locale-specific identifiers. I'm not as concerned
with how a particular character displays; that's not important to
interpreting the script.
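A minimal sketch of the classification point, assuming bash and using a hypothetical helper named is_alpha (not part of bash): the same bytes can pass or fail the [[:alpha:]] test depending on the locale in effect. Only the C-locale results are asserted here; what a given UTF-8 locale reports depends on the locale data installed on the system.

```bash
#!/usr/bin/env bash
# is_alpha LOCALE CHAR  (hypothetical helper, for illustration only)
# Succeeds if CHAR matches [[:alpha:]] under LOCALE. The subshell keeps
# the caller's locale untouched.
is_alpha() {
    (
        LC_ALL=$1
        [[ $2 == [[:alpha:]] ]]
    )
}

# UTF-8 encoding of U+00E9 (e with acute accent), written as raw bytes so
# the example does not depend on the encoding of this file.
e_acute=$'\xc3\xa9'

is_alpha C a          && echo "C locale: 'a' is alphabetic"
is_alpha C "$e_acute" || echo "C locale: U+00E9 is not alphabetic"

# In a UTF-8 locale the second test would typically succeed, if that
# locale is installed (an assumption about the system):
#   is_alpha en_US.UTF-8 "$e_acute" && echo "alphabetic here"
```

A script that used such a character in an identifier would therefore parse in one environment and fail in another, without a single byte of the script changing.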
> Forgive me if I'm misremembering, but hasn't Greg argued against
> the ability to supply "libraries" of re-usable scripts due to
> the ease with which names could conflict with each other and cause
> script incompatibilities?
I'm sure he has. It's a genuine problem without namespaces, so you have
to adopt some naming convention that provides pseudo-namespace
functionality.
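As a rough sketch of such a convention (the library name "strutil" and everything in it are made up for illustration): public names get a library prefix, internal names get the same prefix with a leading underscore, and function-local state is declared local.

```bash
#!/usr/bin/env bash
# Hypothetical library "strutil". Every global name it defines starts
# with strutil_ (public) or _strutil_ (internal), so it cannot collide
# with a caller's "join" or "sep".

_strutil_sep=','          # internal default separator

# strutil_join ARG...  prints the arguments joined by the separator.
strutil_join() {
    local IFS=$_strutil_sep   # "$*" joins on the first character of IFS
    printf '%s\n' "$*"
}

strutil_join a b c
```

The convention is not enforced by the shell; it only works as long as every library author follows it.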
> If it is the case that script libraries had access to unicode
> var & func names (and used it), wouldn't that significantly
> decrease the chances of conflict? Right now, what, ...
> maybe A-Za-z_0-9 + maybe a few others == that's about 64 chars?
That's 63 choices for every character in your identifier. You can easily choose
some prefix (like readline uses _rl_ and rl_) and reduce the potential
for clashes.
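To put rough numbers on that, here is a small sketch with a hypothetical function count_names. It assumes the usual shell rule that a name starts with a letter or underscore (53 choices) and continues with letters, digits, or underscores (63 choices each).

```bash
#!/usr/bin/env bash
# count_names LEN  (hypothetical helper, for illustration only)
# Prints how many distinct ASCII identifiers have exactly LEN characters:
# 53 choices for the first character, 63 for each one after it.
count_names() {
    local len=$1
    echo $(( 53 * 63 ** (len - 1) ))
}

for len in 1 2 4 8; do
    printf '%d characters: %s names\n' "$len" "$(count_names "$len")"
done
```

Even after spending a few characters on a prefix, the remaining positions leave far more names than any set of libraries is likely to use.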
> Even if a character doesn't display in your locality, doesn't
> mean it wouldn't work -- i.e. if I don't have a Cyrillic font
> installed, that doesn't mean the script wouldn't work -- as
> the characters would still be encoded as their Unicode values.
A character that is classified as being a valid alphabetic in one
locale may not be such in another, regardless of its encoding.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU [email protected] http://cnswww.cns.cwru.edu/~chet/