Hi Rui,

Thank you for the code snippet.
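To check that I understood it, here is a minimal sketch of the same idea wrapped in a helper. The name `compare_in_c_locale` is made up for illustration (it is not from your message or from base R), and the sketch assumes the platform allows switching LC_COLLATE to "C".

```r
# Hypothetical helper (not part of base R): evaluate x < y under the "C"
# collation, then restore whatever LC_COLLATE was set before the call.
compare_in_c_locale <- function(x, y) {
  old_loc <- Sys.getlocale("LC_COLLATE")
  on.exit(Sys.setlocale("LC_COLLATE", old_loc), add = TRUE)
  Sys.setlocale("LC_COLLATE", "C")
  x < y
}

compare_in_c_locale("A", "a")  # TRUE: in "C", "A" sorts before "a"

# Code points behind that result (decimal)
utf8ToInt("A")  # 65
utf8ToInt("a")  # 97
```

Using on.exit() means the original collation is restored even if the comparison fails with an error.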
1) How do you find your "Portuguese_Portugal.1252" symbols table now? Is it
this: https://en.wikipedia.org/wiki/Windows-1252?

2) What attributes and values do you check to validate the end result? I see
there is a section "Codepage layout" and I can find the "A" and "a" symbols.
What values in that table tell you "A" is bigger than "a"?

"A" < "a" # returns FALSE
"A" > "a" # returns TRUE

PS! My locale is Estonian_Estonia.1257

Regards,
Kristjan

On Thu, Apr 14, 2022 at 5:05 PM Rui Barradas <ruipbarra...@sapo.pt> wrote:

> Hello,
>
> This is a locale issue, you are counting on the ASCII table codes but
> that's only valid for the "C" locale.
>
> old_loc <- Sys.getlocale("LC_COLLATE")
>
> "A" < "a"
> #> [1] FALSE
> "A" > "a"
> #> [1] TRUE
>
> Sys.setlocale("LC_COLLATE", locale = "C")
> #> [1] "C"
>
> "A" < "a"
> #> [1] TRUE
> "A" > "a"
> #> [1] FALSE
>
> Sys.setlocale("LC_COLLATE", old_loc)
> #> [1] "Portuguese_Portugal.1252"
>
>
> Hope this helps,
>
> Rui Barradas
>
> At 15:06 on 13/04/2022, Kristjan Kure wrote:
> > Hi!
> >
> > Sorry, I am a beginner in R.
> >
> > I was not able to find answers to my questions (tried Google, Stack
> > Overflow, etc). Please correct me if anything is wrong here.
> >
> > When comparing symbols/strings in R - raw numeric values are compared
> > symbol by symbol starting from left? If raw numeric values are not used is
> > there an ASCII / Unicode table where symbols have values/ranking/order and
> > R compares those values?
> >
> > *2) Comparing symbols*
> > Letter "a" raw value is 61, letter "b" raw value is 62? Is this correct?
> >
> > # Raw value for "a" = 61
> > a_raw <- charToRaw("a")
> > a_raw
> >
> > # Raw value for "b" = 62
> > b_raw <- charToRaw("b")
> > b_raw
> >
> > # equals TRUE
> > "a" < "b"
> >
> > Ok, so 61 is less than 62 so it's TRUE. Is this correct?
> >
> > *3) Comparing strings #1*
> > "1040" <= "12000"
> >
> > raw_1040 <- charToRaw("1040")
> > raw_1040
> > #31 *30* (comparison happens with the second symbol) 34 30
> >
> > raw_12000 <- charToRaw("12000")
> > raw_12000
> > #31 *32* (comparison happens with the second symbol) 30 30 30
> >
> > The symbol in the second position is 30 and it's less than 32. Equals to
> > true. Is this correct?
> >
> > *4) Comparing strings #2*
> > "1040" <= "10000"
> >
> > raw_1040 <- charToRaw("1040")
> > raw_1040
> > #31 30 *34* (comparison happens with third symbol) 30
> >
> > raw_10000 <- charToRaw("10000")
> > raw_10000
> > #31 30 *30* (comparison happens with third symbol) 30 30
> >
> > The symbol in the third position is 34 is greater than 30. Equals to false.
> > Is this correct?
> >
> > *5) Problem - Why does this equal FALSE?*
> > *"A" < "a"*
> >
> > 41 < 61 # FALSE?
> >
> > # Raw value for "A" = 41
> > A_raw <- charToRaw("A")
> > A_raw
> >
> > # Raw value for "a" = 61
> > a_raw <- charToRaw("a")
> > a_raw
> >
> > Why is capitalized "A" not less than lowercase "a"? Based on raw values it
> > should be. What am I missing here?
> >
> > Thanks
> > Kristjan
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
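
A note on the byte values in the quoted questions: charToRaw() prints bytes in hexadecimal, so the 61 shown for "a" is 97 in decimal. The sketch below converts the bytes to integers and finds the position where two strings first differ. The helper name `first_diff_pos` is hypothetical, and the byte-by-byte view should only be taken as a description of the "C" collation; other locales may order characters by different rules.

```r
# charToRaw() shows hexadecimal; as.integer() gives the decimal byte values
charToRaw("a")              # 61 (hex)
as.integer(charToRaw("a"))  # 97 (decimal)

# Hypothetical helper: position of the first byte at which x and y differ,
# or NA if they agree over the length of the shorter string.
first_diff_pos <- function(x, y) {
  bx <- as.integer(charToRaw(x))
  by <- as.integer(charToRaw(y))
  n <- min(length(bx), length(by))
  idx <- which(bx[seq_len(n)] != by[seq_len(n)])
  if (length(idx) == 0L) NA_integer_ else idx[1L]
}

first_diff_pos("1040", "12000")  # 2: "0" (48) vs "2" (50)
first_diff_pos("1040", "10000")  # 3: "4" (52) vs "0" (48)
```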