Chet Ramey wrote:
http://lists.gnu.org/archive/html/bug-bash/2012-05/msg00086.html
----
The above relies upon a hack to the algorithm -- a *USEFUL* hack
in most cases, but still a hack.

When I type locale I get:
LANG=en_US.UTF-8
LC_CTYPE=en_US.UTF-8
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE=C
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
----
Note: before bash broke UTF-8 compatibility, I could use
en_US.UTF-8 throughout, but now I have to set LC_COLLATE=C as shown
above.  I assert that needing to do so is a bug.
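
For reference, the listing above reports LC_COLLATE=C even though LANG is
en_US.UTF-8 because of the usual POSIX precedence: LC_ALL if set and
non-empty, then the category variable, then LANG. Below is a minimal sketch
of that lookup. The function name effective_collate is made up for this
example, and the POSIX fallback when nothing is set is an assumption (the
real default is implementation-defined).

```shell
#!/usr/bin/env bash
# effective_collate LC_ALL LC_COLLATE LANG
# Hypothetical helper: prints the locale name that would govern collation,
# following the usual precedence LC_ALL > LC_COLLATE > LANG.
effective_collate() {
    local lc_all=$1 lc_collate=$2 lang=$3
    # Empty values are treated as unset; POSIX is an assumed last resort.
    printf '%s\n' "${lc_all:-${lc_collate:-${lang:-POSIX}}}"
}

# Apply it to the current environment.
effective_collate "${LC_ALL-}" "${LC_COLLATE-}" "${LANG-}"
```

With the settings shown above (empty LC_ALL, LC_COLLATE=C, LANG=en_US.UTF-8)
this prints C.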

I will make no claim about en_US.iso88591 or other locale-specific
charsets.  However, UTF-8 defines collation order the same as ASCII in
the bottom 128 characters (code points 0-127).

Bash ignores UTF-8's collation order.  I really do not know if the
odd character collation order is associated with en_US -- but it seems
that collation order of the UTF-8 character set should override the more
general 'en_US'.
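
One way to see the difference in collation order is to sort the same
letters under two locales. This is only a sketch: collate_sorted is a
made-up helper, and it assumes sort and paste are available. What
en_US.UTF-8 produces depends on the locale data installed on the machine,
so only the C result is stated here.

```shell
#!/usr/bin/env bash
# collate_sorted LOCALE WORD...
# Hypothetical helper: sorts the words under LOCALE and prints them on one
# line, separated by spaces.
collate_sorted() {
    local loc=$1
    shift
    printf '%s\n' "$@" | LC_ALL=$loc sort | paste -s -d ' ' -
}

# In the C locale the order is by byte value, so uppercase sorts first.
collate_sorted C b A a B          # prints: A B a b

# Compare with the locale under discussion (output depends on the system).
collate_sorted en_US.UTF-8 b A a B
```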

For some reason, I am not allowed to use LC_COLLATE=UTF-8:
-bash: warning: setlocale: LC_COLLATE: cannot change locale (UTF-8): No such file or directory
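
A quick way to check whether a name such as UTF-8 is known to the system
is to look for it in the output of locale -a. The sketch below assumes the
locale command is present; locale_listed is a made-up name. Note that the
match is exact, and some systems spell the codeset differently in that list
(glibc typically shows en_US.utf8 rather than en_US.UTF-8).

```shell
#!/usr/bin/env bash
# locale_listed NAME
# Hypothetical helper: reads a list of locale names on stdin (one per line)
# and succeeds only if NAME appears as a whole line.
locale_listed() {
    grep -qxF -- "$1"
}

# Check the bare name from the warning above against the installed locales.
if locale -a 2>/dev/null | locale_listed UTF-8; then
    echo "UTF-8 is listed"
else
    echo "UTF-8 is not listed"
fi
```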

This seems related to the problem: perl and other programs distinguish
UTF-8 from utf8/UTF8, treating UTF-8 as the official name, which comes
with an official collation order.

Thus it seems like having LC_COLLATE=UTF-8 generate an error is a booboo
somewhere (gnu libs?)...

If I were in a Chinese locale and using a Chinese sorting order, I don't
know whether I would find an option to use ASCII sorting order useful.  But I
would find it useful if it respected the UTF-8 collation requirements, as that
handles not only eE, but all the accented forms as well.

So would this be a LibC/icui18n bug?

