[Rd] Problem with order() and I()
peter dalgaard
pdalgd at gmail.com
Wed Sep 10 00:54:12 CEST 2014
[This is the note I alluded to earlier today.]
On 08 Sep 2014, at 18:06 , MacQueen, Don <macqueen1 at llnl.gov> wrote:
> I have found that order() fails in a rather arcane circumstance, as in
> this example:
>
>> foo <- I( c('x','\265g') )
>> order(foo)
> Error in if (xi > xj) 1L else -1L : missing value where TRUE/FALSE needed
>> foo <-c('x','\265g')
>> order(foo)
> [1] 1 2
>
>
The oddity is really that it works (for some value of "works") in the unclassed case:
> foo <- I( c('x','\265g') )
> order(foo)
Error in if (xi > xj) 1L else -1L : missing value where TRUE/FALSE needed
> foo[[1]]
[1] "x"
> foo[[2]]
[1] "\xb5g"
> foo[[1]] < foo[[2]]
[1] NA
> foo[[1]] > foo[[2]]
[1] NA
> fee <- c('x','\265g')
> fee[[1]]
[1] "x"
> fee[[2]]
[1] "\xb5g"
> fee[[1]] < fee[[2]]
[1] NA
> fee[[1]] > fee[[2]]
[1] NA
> order(fee)
[1] 2 1
Notice that the unclassed `fee` has exactly the same problem as `foo`: its elements are incomparable.
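For what it is worth, the bytes involved can be inspected directly. This is only a sketch. It assumes a UTF-8 session, where a lone 0xb5 byte is not a valid character; I have not checked that the same explanation holds in the C locale reported in the original post. Note also that validUTF8() needs R >= 3.3.0, so it is not available in the R 3.1.1 session shown in this thread.

```r
fee <- c('x', '\265g')

# Raw bytes of the second element: 0xb5 followed by "g" (0x67)
charToRaw(fee[[2]])   # [1] b5 67

# Byte-level check, independent of the current locale:
# "x" is valid UTF-8, a lone 0xb5 byte is not
validUTF8(fee)        # [1]  TRUE FALSE
```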
The thing is that xtfrm.AsIs will use elementwise comparison, whereas xtfrm.default will use rank(), which somehow manages to do something with character vectors for which the sort order is undefined:
> rank(foo)
Error in if (xi > xj) 1L else -1L : missing value where TRUE/FALSE needed
> rank(fee)
[1] 2 1
(Notice that xtfrm calls rank and vice versa, presumably without creating a loop. I gave up on sorting out the logic.)
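One possible way around this, as a sketch only: if '\265' is meant as the Latin-1 micro sign (my assumption, based on the p.s. quoted below), then declaring that encoding and converting to UTF-8 should give strings that can be compared, after which order() on the I() object no longer runs into the NA. The names fee_l1, fee_ok and foo_ok are made up for illustration.

```r
fee_l1 <- c('x', '\265g')
Encoding(fee_l1) <- "latin1"   # assumption: byte 0xb5 is the Latin-1 micro sign
fee_ok <- enc2utf8(fee_l1)     # valid UTF-8 copy of the same text
foo_ok <- I(fee_ok)            # same AsIs wrapper as in the original report

fee_ok[[1]] < fee_ok[[2]]      # TRUE or FALSE depending on collation, but not NA
order(foo_ok)                  # a permutation of 1:2 instead of an error
```

Which of the two elements sorts first still depends on the collation rules of the locale, so the sketch makes no claim about the actual order.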
>
>> sessionInfo()
> R version 3.1.1 (2014-07-10)
> Platform: x86_64-apple-darwin13.1.0 (64-bit)
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> Thanks
> -Don
>
> p.s.
> Just a little background, irrelevant unless one wonders why I'm using I()
> and \265:
>
> If I were writing new code I wouldn't be using I(), since there are better
> ways now to achieve the same end (preventing the creation of factors in
> data frames), but the scripts that use it are quite old, originally
> developed in 2001.
>
> In at least some but perhaps limited contexts, '\265' produces the Greek
> letter mu, and that's why I'm using it. And if I remember correctly, 2001
> is prior to the current R support for locales and extended character sets.
> Using \265 is what I could find at that time to get a mu into my output.
>
> I came across this while checking some things; it's not actually breaking
> my scripts, so I doubt it's due to any recent change.
>
>
> --
> Don MacQueen
>
> Lawrence Livermore National Laboratory
> 7000 East Ave., L-627
> Livermore, CA 94550
> 925-423-1062
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com