Re: RFE: Please allow unicode ID chars in identifiers

Chet Ramey Tue, 13 Jun 2017 11:14:29 -0700

On 6/2/17 12:52 AM, dualbus wrote:

> - There are some questions that must be answered first:
> 
>   * How do you know how to decode multibyte character sequences into Unicode? 
>     Should UTF-8 be assumed?


It has to be the current locale.
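As a rough illustration of why the locale matters here, the sketch below measures the same two bytes under a given locale. The helper name `char_count` is hypothetical, and the behaviour described assumes a shell such as bash that honours LC_ALL when computing `${#var}`.

```shell
# "é" encoded as the two-byte UTF-8 sequence 0xC3 0xA9, built from octal escapes
s=$(printf '\303\251')

# char_count LOCALE STRING
# Prints the length of STRING as the shell sees it under LOCALE.
# Runs in a subshell so the caller's locale is left untouched.
char_count() {
    ( LC_ALL=$1; printf '%s\n' "${#2}" )
}

# In the C locale each byte is its own character.
char_count C "$s"
```

Under a UTF-8 locale (for example en_US.UTF-8, if it is installed) bash would be expected to treat the same bytes as a single character, so whether a byte sequence forms a valid identifier character depends on the locale in effect.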

>   * Will the parsing of a script depend upon the user locale?

Only in the sense that identifiers will depend on the current locale.

>   * Should this special parsing code be disabled if POSIX mode is
>     enabled?

Yes. POSIX requires that variables be names, as defined below.  However,
it should be possible to enable it while in POSIX mode as an extension.

>   * Right now `name' or `identifier' is defined as:
> 
>       name: A word consisting only of alphanumeric characters and
>       underscores, and beginning with an alphabetic character or an
>       underscore. Also referred to as an identifier.
> 
>     What will the definition look like with Unicode identifiers?

Add 'from the current locale's character set'.
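A minimal sketch of that amended definition, using only shell pattern matching: the `[:alpha:]` and `[:alnum:]` classes are evaluated against the current locale, so the set of accepted characters follows the locale's character set. The function name `is_name` is hypothetical, and this is not how bash's parser performs the check.

```shell
# is_name WORD
# Succeeds if WORD consists only of alphanumerics and underscores
# (as classified by the current locale) and does not start with a digit.
is_name() {
    case $1 in
        '')                return 1 ;;  # empty string is not a name
        [![:alpha:]_]*)    return 1 ;;  # must begin with a letter or underscore
        *[![:alnum:]_]*)   return 1 ;;  # every character must be alnum or underscore
        *)                 return 0 ;;
    esac
}

for candidate in foo_1 _x 1foo a-b; do
    if is_name "$candidate"; then
        echo "$candidate: name"
    else
        echo "$candidate: not a name"
    fi
done
```

In the C locale this reduces to the existing ASCII-only rule; in a locale with a larger alphabet, the same patterns would accept that locale's letters as well.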

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    [email protected]    http://cnswww.cns.cwru.edu/~chet/
