[R] Symbol/String comparison in R
Thu Apr 14 12:20:02 CEST 2022
For some questions it is useful to learn by experiment. It gives you experience and shows you what sorts of error messages to expect. At the console, type things like this:
a > B
gives an error (unquoted a and B are treated as object names, and no such objects exist), whereas
"a" > "B"
gives FALSE.
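To see the difference without stopping on the error, the first comparison can be wrapped in tryCatch(). This is only a small sketch, and it assumes a fresh session in which no objects named a or B have been defined; the names msg and res are just illustrative.

```r
# Assumption: no objects called a or B exist in the current session.
# Unquoted a and B are symbols, so R tries to look up objects with those
# names and signals an error when it cannot find them.
msg <- tryCatch(
  a > B,
  error = function(e) conditionMessage(e)
)
print(msg)  # the text of the error message

# Quoted "a" and "B" are character strings, so the comparison succeeds and
# returns a single logical value (which value depends on the locale in use).
res <- "a" > "B"
print(res)
```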
"I was not able to find answers to my questions (tried Google, Stack Overflow, etc). Please correct me if anything is wrong here."
R has an extensive Help system, and that should always be the first place you look. In this case, ?"<" (at the R prompt) brings you to the Help page for comparisons (as would ?Comparison, but unfortunately only if the "C" is in upper case). Among lots of other stuff, it says:
"Comparison of strings in character vectors is lexicographic within the strings using the collating sequence of the locale in use: see locales." ... (+ lots more).
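A short sketch of what that sentence means in practice: the result of a string comparison depends on the collation setting, not directly on the byte values. Switching LC_COLLATE to "C" for a moment makes R compare by byte value. The variable names below are illustrative, and what your own locale returns before the switch may differ from machine to machine, so no result is claimed for it here.

```r
# charToRaw() prints bytes in hexadecimal, so "61" for "a" means 97 in decimal.
a_code <- as.integer(charToRaw("a"))  # 97
A_code <- utf8ToInt("A")              # 65 (hexadecimal 41)

# Remember the current collation setting so it can be restored afterwards.
old_collate <- Sys.getlocale("LC_COLLATE")

# In the "C" locale strings are compared by byte value, so every upper-case
# ASCII letter sorts before every lower-case one.
Sys.setlocale("LC_COLLATE", "C")
c_result <- "A" < "a"                      # TRUE in the "C" locale
c_sorted <- sort(c("b", "A", "a", "B"))    # "A" "B" "a" "b"

# Put the original collation setting back.
Sys.setlocale("LC_COLLATE", old_collate)

print(c_result)
print(c_sorted)
```

In many other locales (for example the English ones) upper- and lower-case versions of a letter are kept next to each other, which is why "A" < "a" need not follow the byte values.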
Incidentally, rseek.org and rdrr.io are another couple of good places to look for R documentation.
Bert Gunter
"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
> Sorry, I am a beginner in R.
> I was not able to find answers to my questions (tried Google, Stack
> Overflow, etc). Please correct me if anything is wrong here.
>
> When comparing symbols/strings in R - raw numeric values are compared
> symbol by symbol starting from left? If raw numeric values are not
> used is there an ASCII / Unicode table where symbols have
> values/ranking/order and R compares those values?
>
> *2) Comparing symbols*
> Letter "a" raw value is 61, letter "b" raw value is 62? Is this correct?
>
> # Raw value for "a" = 61
> a_raw <- charToRaw("a")
> a_raw
>
> # Raw value for "b" = 62
> b_raw <- charToRaw("b")
> b_raw
>
> # equals TRUE
> "a" < "b"
>
> Ok, so 61 is less than 62 so it's TRUE. Is this correct?
>
> *3) Comparing strings #1*
> "1040" <= "12000"
>
> raw_1040 <- charToRaw("1040")
> raw_1040
> #31 *30* (comparison happens with the second symbol) 34 30
>
> raw_12000 <- charToRaw("12000")
> raw_12000
> #31 *32* (comparison happens with the second symbol) 30 30 30
>
> The symbol in the second position is 30, which is less than 32, so the
> result is TRUE. Is this correct?
>
> *4) Comparing strings #2*
> "1040" <= "10000"
>
> raw_1040 <- charToRaw("1040")
> raw_1040
> #31 30 *34* (comparison happens with third symbol) 30
>
> raw_10000 <- charToRaw("10000")
> raw_10000
> #31 30 *30* (comparison happens with third symbol) 30 30
>
> The symbol in the third position is 34, which is greater than 30, so the
> result is FALSE. Is this correct?
>
> *5) Problem - Why does this equal FALSE?* *"A" < "a"*
>
> 41 < 61 # FALSE?
>
> # Raw value for "A" = 41
> A_raw <- charToRaw("A")
> A_raw
>
> # Raw value for "a" = 61
> a_raw <- charToRaw("a")
> a_raw
>
> Why is capitalized "A" not less than lowercase "a"? Based on raw
> values it should be. What am I missing here?
>
> Thanks
> Kristjan
>
>
