The following caused a hard-to-diagnose problem for a user of the survey 
package.  Presumably this is a strange Unicode thing, but is there a 
convenient reference for how the collation order is determined? I am 
surprised that adding the same character to the end of two strings of the 
same length can change the sorting order.

In the en_US.utf8 locale:
> "1//"<"10/"
[1] TRUE
> "1//2"<"10/2"
[1] FALSE

In the C locale on the same system:
> "1//"<"10/"
[1] TRUE
> "1//2"<"10/2"
[1] TRUE

[This is in r-devel of March 6, but the problem that was reported to me 
involved Windows vs Linux on released versions]
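
For anyone hitting the same thing, here is a minimal sketch of how one might check which collation rules are active and force a locale-independent ordering. It assumes a reasonably recent R (3.3.0 or later, where sort() accepts method = "radix" for character vectors and, per its documentation, orders them as in the C locale). The names x, sorted_c, old and c_result are illustrative only and do not come from the survey package.

```r
x <- c("10/2", "1//2", "10/", "1//")

# Which collation rules does this session use for string comparison?
Sys.getlocale("LC_COLLATE")

# Locale-dependent: the result of this comparison can differ between systems.
"1//2" < "10/2"

# Assumption: R >= 3.3.0. The radix method sorts character vectors in
# C-locale (byte) order regardless of the session locale.
sorted_c <- sort(x, method = "radix")
sorted_c

# Alternatively, switch collation to C temporarily so that `<` and the
# default sort() behave the same everywhere, then restore the old setting.
old <- Sys.getlocale("LC_COLLATE")
Sys.setlocale("LC_COLLATE", "C")
c_result <- "1//2" < "10/2"
Sys.setlocale("LC_COLLATE", old)
c_result
```

In byte order "/" sorts just before "0", so both approaches put "1//2" ahead of "10/2". Whether it is better to pin the ordering inside a package or leave it to the user's locale is a design choice; the sketch only shows that the ordering can be made reproducible.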

        -thomas

Thomas Lumley                   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]       University of Washington, Seattle

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel