On 6/2/17 12:52 AM, dualbus wrote:

> - There are some questions that must be answered first:
>
>   * How do you know how to decode multibyte character sequences into Unicode?
>     Should UTF-8 be assumed?

It has to be the current locale.

>   * Will the parsing of a script depend upon the user locale?

Only in the sense that identifiers will depend on the current locale.

>   * Should this special parsing code be disabled if POSIX mode is
>     enabled?

Yes. Posix requires that variables be names, as defined below. However, it
should be possible to enable it while in Posix mode as an extension.

>   * Right now `name' or `identifier' is defined as:
>
>       name: A word consisting only of alphanumeric characters and
>       underscores, and beginning with an alphabetic character or an
>       underscore. Also referred to as an identifier.
>
>     How will the definition look like with Unicode identifiers?

Add 'from the current locale's character set'.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://cnswww.cns.cwru.edu/~chet/
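
To make the amended definition concrete, here is a rough sketch of a name check that decodes a word using the current locale's encoding before applying the "alphabetic or underscore first, alphanumeric or underscore after" rule. It is only an illustration, not how the shell does or would implement it: `is_name` is a hypothetical helper, the `encoding` parameter exists only so the check can be exercised without changing the process locale, and Python's `str.isalpha()` and `str.isalnum()` use Unicode properties, which is assumed here as a stand-in for the locale's own character classes.

```python
import locale


def is_name(raw, encoding=None):
    """Return True if the bytes in raw form a name under the given encoding."""
    if encoding is None:
        # Assumption: take the character encoding from the user's current locale.
        encoding = locale.getpreferredencoding(False)
    try:
        text = raw.decode(encoding)
    except UnicodeDecodeError:
        # Bytes that are invalid in this encoding cannot be part of a name.
        return False
    if not text:
        return False
    # First character: alphabetic or underscore.
    if not (text[0] == "_" or text[0].isalpha()):
        return False
    # Remaining characters: alphanumeric or underscore.
    return all(ch == "_" or ch.isalnum() for ch in text[1:])
```

Under this sketch the bytes `b"caf\xc3\xa9"` are a name when the encoding is UTF-8 but not when it is ASCII, which is the sense in which identifiers depend on the current locale.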