[R] A behaviour pattern that I find mysterious.

Duncan Murdoch
Thu Feb 27 02:35:44 CET 2020


On 26/02/2020 8:09 p.m., Rolf Turner wrote:
> 
> Consider the following:
> 
> x <- letters[1:5]
> x < 0
> 
> This gives
> 
>> [1] FALSE FALSE FALSE FALSE FALSE
> 
> which kind of makes sense, I guess, though I would a priori have
> expected all NAs.
> 
> But then do:
> 
> x[3] <- "*"
> x < 0
> 
> This gives
> 
>> [1] FALSE FALSE  TRUE FALSE FALSE
> 
> which puzzles me.  Why is "*" considered to be less than 0?
> 
> At one point I made the conjecture that it had something to do with the
> ordering of ASCII characters, but it does not seem to.  A little more
> investigation led me to conjecture that all ASCII characters except
> real-live letters and numerals come out as being less than 0.
> 
> Can anyone explain the rationale to me?  Not that it matters a damn.
> Just idle curiosity.

It's doing a string comparison: the 0 is coerced to the string "0" and
compared with each element of x.  The ordering of strings will depend on
your locale.  You can read the ?icuGetCollate help page if you want to
spend a lot of time reading a help page.  Not sure it'll answer your
question, though...
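
For what it's worth, here is a minimal sketch of the comparison, assuming
only base R.  The variable names (by_number, by_string, codes, c_order) are
made up for illustration.  It shows that comparing against the number 0 is
the same as comparing against the string "0", and what plain byte (C
locale) ordering looks like for a few characters.

```r
# The vector from the question, after x[3] <- "*".
x <- c("a", "b", "*", "d", "e")

# Comparing a character vector with a number coerces the number to a
# string, so these two comparisons are the same operation.
by_number <- x < 0
by_string <- x < "0"
print(identical(by_number, by_string))

# Code points of a few characters: "*" is 42, "0" is 48, "a" is 97.
codes <- vapply(c("*", "0", "a"), utf8ToInt, integer(1))
print(codes)

# The collation locale in effect decides how strings are ordered
# (the value printed here depends on your system).
print(Sys.getlocale("LC_COLLATE"))

# sort() with method = "radix" always uses C-locale (byte) ordering,
# whatever the session locale is.
c_order <- sort(c("a", "0", "*", ":"), method = "radix")
print(c_order)
```

In byte order ":" sorts after "0", whereas many non-C locales put
punctuation ahead of the digits, so ":" < 0 may well come out TRUE in
your session.  That would be consistent with the observation that the
results do not follow the ASCII ordering.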

Duncan Murdoch


