Re: Patch for unicode in varnames...

dualbus Mon, 05 Jun 2017 07:05:24 -0700

On Mon, Jun 05, 2017 at 04:52:19AM -0400, George wrote:
[...]
> To hazard a guess: Each call to legal_identifier() and assignment() in
> the patched code requires copying the parameter and translating it to
> a wide-character string (with no provision for skipping the added work
> as a build option). It appears the memory allocated for these copies
> leaks (I didn't see any added calls to xfree() to go with those new
> xmallocs()), and the character type for the character conversion is
> derived from the user's locale (which means there's not a reliable
> mechanism in place to run a script in a locale whose character
> encoding doesn't match that of the script.) And he did mention "issues
> with compound assignments" as well. Those issues would need to be
> resolved.


Correct. The patch also mixes wide-character strings and ordinary char
strings, because that was quicker to hack together.
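
To make the leak point concrete, a check along the lines George describes
could free its temporary copy on every path. The following is only a rough
sketch, not code from the patch: wide_legal_identifier() is a hypothetical
name, plain malloc()/free() stand in for bash's xmalloc()/xfree(), and the
conversion still depends on the current LC_CTYPE locale, so it does nothing
for the locale-mismatch problem.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper (NOT bash's legal_identifier()): return 1 if the
   multibyte string NAME is a valid identifier under the current LC_CTYPE
   locale, 0 otherwise (including on an invalid byte sequence). */
int
wide_legal_identifier (const char *name)
{
  size_t len, n, i;
  wchar_t *w;
  int ok;

  if (name == NULL || *name == '\0')
    return 0;

  /* A multibyte string never holds more characters than bytes, so
     len + 1 wide characters is always enough room. */
  len = strlen (name);
  w = malloc ((len + 1) * sizeof (wchar_t));
  if (w == NULL)
    return 0;

  n = mbstowcs (w, name, len + 1);
  if (n == (size_t) -1)
    {
      free (w);                 /* invalid sequence: still release the copy */
      return 0;
    }

  /* First character: letter or underscore; the rest: alphanumeric or
     underscore, using the wide-character classification functions. */
  ok = (iswalpha ((wint_t) w[0]) || w[0] == L'_');
  for (i = 1; ok && i < n; i++)
    ok = (iswalnum ((wint_t) w[i]) || w[i] == L'_');

  free (w);                     /* release the temporary wide copy */
  return ok;
}
```

A caller would have to run setlocale(LC_CTYPE, "") first for non-ASCII
names to be classified; in the default "C" locale only ASCII names pass.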

By the way, ksh93 and zsh already support Unicode identifiers:

  dualbus@debian:~$ for sh in bash mksh ksh93 zsh; do LC_CTYPE=en_US.utf8 $sh -c 'φ=phi; echo $φ'; done
  bash: φ=phi: command not found
  $φ
  mksh: φ=phi: not found
  $φ
  phi
  phi

And all four of these shells support Unicode function names:

  dualbus@debian:~$ for sh in bash mksh ksh93 zsh; do LC_CTYPE=en_US.utf8 $sh -c 'φ() { echo hi; }; φ'; done
  hi
  hi
  hi
  hi

-- 
Eduardo Bustamante
https://dualbus.me/
