Greg Wooledge wrote:
On Mon, May 21, 2012 at 12:19:26PM -0700, Linda Walsh wrote:
Greg Wooledge wrote:
For instance, on HP-UX 10.20, in the en_US.iso88591 locale:
A a ... B b
Meanwhile, on Debian 6.0, in the en_US.iso88591 locale:
a A ... b B
So which is correct?
Both. Locale collating order is determined by the OS. You cannot
rely on it, unless you set the LC_COLLATE variable to "C" or "POSIX",
in which case you get ASCII behavior (accented letters are not part
of the character set at all).
Anyone wanting to reference an upper- or lower-case range,
[a-z] or [A-Z], is gonna hurt from this.
Correct.
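To make the point above concrete, here is a minimal bash sketch of pinning the collating order so that [a-z] means only the 26 lowercase ASCII letters. The function name is_ascii_lower is hypothetical, not something from this thread. It sets LC_ALL rather than LC_COLLATE, on the assumption that the caller's environment may already have LC_ALL set, in which case LC_ALL would override LC_COLLATE.

```shell
# Hypothetical helper (name invented for illustration): succeeds only if
# $1 is a single lowercase ASCII letter, whatever the caller's locale is.
is_ascii_lower() {
    # Scope the locale change to this function. LC_ALL is used because it
    # takes precedence over LC_COLLATE when both are set.
    local LC_ALL=C
    [[ $1 == [a-z] ]]
}

# Usage: classify a few single characters.
for ch in a B z; do
    if is_ascii_lower "$ch"; then
        printf '%s: in [a-z]\n' "$ch"
    else
        printf '%s: not in [a-z]\n' "$ch"
    fi
done
```

Without the LC_ALL=C line, whether B matches [a-z] depends on the OS locale definitions, which is exactly the HP-UX versus Debian difference shown above.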
----
This is a prime example of POSIX being stupid and bad for
computer science.
They take a deterministic behavior and define it to be
non-deterministic, and break thousands of programs.
They cannot justify this... as they are supposed to document
current practice -- which has never been to consider the interpretation
of a-z/A-Z as random!
Thus they are violating their own rules! How can anyone follow
such lame directions? Who in their right mind would have voted to
make ranges "worthless"? Established, standard practice has never
been for such ranges to be worthless, yet that is exactly what they
voted for.
How is POSIX following its own rules? If they don't follow
their own rules, how can anyone follow these new specifications,
which are obviously in conflict with established implementations?