On 26/02/2020 8:09 p.m., Rolf Turner wrote:
> Consider the following:
>
> x <- letters[1:5]
> x < 0
>
> This gives
>
> [1] FALSE FALSE FALSE FALSE FALSE
>
> which kind of makes sense, I guess, though I would a priori have
> expected all NAs.
>
> But then do:
>
> x[3] <- "*"
> x < 0
>
> This gives
>
> [1] FALSE FALSE  TRUE FALSE FALSE
>
> which puzzles me. Why is "*" considered to be less than 0?
>
> At one point I made the conjecture that it had something to do with the
> ordering of ASCII characters, but it does not seem to. A little more
> investigation led me to conjecture that all ASCII characters except
> real-live letters and numerals come out as being less than 0.
>
> Can anyone explain the rationale to me? Not that it matters a damn.
> Just idle curiosity.

It's doing a string comparison, but ordering will depend on your locale.
You can read the ?icuGetCollate help page if you want to spend a lot
of time reading a help page. Not sure it'll answer your question, though...
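
A minimal sketch of what "string comparison" means here, in case it helps. It assumes only base R; the variable names (same_as_string, code_points, old_collate, in_c_locale, colon_in_c) are made up for illustration. The number 0 is coerced to the string "0" before comparing, and the result then depends on the collating sequence in use. Switching LC_COLLATE to "C" forces a plain byte-order comparison, which gives a locale-independent baseline to compare against.

```r
x <- c("a", "b", "*", "d", "e")

# A character vector compared with a number: the number is coerced to a
# string first, so x < 0 is evaluated as x < "0".
same_as_string <- identical(x < 0, x < "0")

# Code points of the characters involved.
code_points <- c(star = utf8ToInt("*"), zero = utf8ToInt("0"),
                 colon = utf8ToInt(":"), a = utf8ToInt("a"))

# Temporarily switch collation to the C locale, where strings are
# compared by their byte values, then restore the previous setting.
old_collate <- Sys.getlocale("LC_COLLATE")
invisible(Sys.setlocale("LC_COLLATE", "C"))
in_c_locale <- x < 0        # only "*" sorts before "0" in byte order
colon_in_c  <- ":" < "0"    # FALSE here, since ":" comes after "9"
invisible(Sys.setlocale("LC_COLLATE", old_collate))

print(same_as_string)
print(code_points)
print(in_c_locale)
print(colon_in_c)
```

Running ":" < "0" again under your usual locale and comparing it with colon_in_c is a quick way to see whether your collation differs from byte order; the result outside the C locale is platform- and locale-dependent, so none is claimed here.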
Duncan Murdoch