[R] How to avoid ifelse statement converting factor to character

Stavros Macrakis macrakis at alum.mit.edu
Thu Jun 25 17:37:41 CEST 2009


On Wed, Jun 24, 2009 at 9:04 PM, Rolf Turner<r.turner at auckland.ac.nz> wrote:
>  Do not get your knickers in a twist.  R works simply and straightforwardly
>  in simple straightforward situations.

Though I find R an incredibly useful tool, alas, it is simply not true
that "R works simply and straightforwardly in simple straightforward
situations."  No doubt this is for understandable historical reasons
and backwards compatibility, but there it is.

Some examples of simple straightforward situations:

I think it is reasonable to expect that appending a list/vector of
class X to another list/vector of class X would result in a
list/vector of class X.  Similarly for the union of two lists/vectors of
class X. But in fact, not only is this not true for some of R's
important classes (factors, date/time, and delta-date/time), but the
result class is inconsistent by function and by class:

    ff <- factor("b")
    c(ff,ff) => 1 1    # class integer
    union(ff,ff) => "b"    # class character

    tt <- as.POSIXct('2009-01-01')
    c(tt,tt) => "2009-01-01 EST" "2009-01-01 EST" # class POSIXt/POSIXct
    union(tt,tt) => 1230786000    # class numeric

    dt <- tt - tt       # class difftime
    c(dt,dt)  => 0 0  # class numeric
    union(dt,dt) =>  0  # class numeric
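
A minimal sketch for checking these results on your own installation,
since what c() returns for these classes may differ between R versions.
The workarounds at the end are assumptions of this sketch (one common way
to keep the class), not something proposed in the thread:

```r
# Report the class of whatever c() and union() return on this R version.
ff <- factor("b")
tt <- as.POSIXct("2009-01-01", tz = "UTC")

class(c(ff, ff))
class(union(ff, ff))
class(c(tt, tt))
class(union(tt, tt))

# Assumed workarounds: combine factors via their labels, and de-duplicate
# with unique(), which keeps the class of factors and POSIXct values.
ff2 <- factor(c(as.character(ff), as.character(ff)))  # a factor of length 2
fu  <- unique(ff2)                                    # still a factor
tu  <- unique(c(tt, tt))                              # still POSIXct
```

Going through as.character() rebuilds the level set from the labels, so
it also works when the two factors have different levels.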

Similarly, the simplest, most straightforward situation I can think of
for ifelse is when the yes and no arguments are identical, and in that
case, I would (I think reasonably) expect that the result is of the
same class as the arguments, but it is not:

     ifelse(TRUE,factor("b"),factor("b")) => 1 (integer)
     ifelse(TRUE,tt,tt) => 1230786000 (class numeric)

I hope you will agree that all of these are very simple and
straightforward situations, and that R is not working simply and
straightforwardly in them.
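
For the ifelse cases, one way to keep the class is to start from one
argument and overwrite the selected positions with subassignment.  This is
only a sketch: ifelse_keep is a hypothetical helper, not a base R function,
and it assumes that test, yes and no have the same length, that test
contains no NA, and (for factors) that every selected value of yes is
already a level of no.

```r
# Hypothetical helper (not part of base R): choose elementwise between
# `yes` and `no`, keeping the class and attributes of `no`.
ifelse_keep <- function(test, yes, no) {
  out <- no
  out[test] <- yes[test]   # dispatches to the class's own `[<-` method
  out
}

f1 <- factor(c("a", "b", "c"), levels = c("a", "b", "c"))
f2 <- factor(c("c", "c", "a"), levels = c("a", "b", "c"))
r1 <- ifelse_keep(c(TRUE, FALSE, TRUE), f1, f2)   # factor: a c c

t1 <- as.POSIXct("2009-01-01", tz = "UTC")
r2 <- ifelse_keep(TRUE, t1, t1)                   # stays POSIXct
```

Unlike ifelse, this takes its result class from no rather than from test,
which is why the class survives.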

The less simple and less straightforward situations are of course more
complicated.

>  In respect of the current discussion of ifelse() --- the original problem arose
>  because the values of ``yes'' and ``no'' were of different modes. It is obvious
>  that in such instances a decision will have to be made about the mode
>  of the result.  The appropriateness of the designers' decision may be
> disputed,

Indeed.

> If you don't understand what's going on, then just stick to using
> ifelse() only when ``yes'' and ``no'' have the same mode.

That's not enough.  They also have to be plain vectors with no class
attribute (not factors or date/times, for instance).  See above.

> Bottom line:  R is easy to use at any level, but in order to use it at a
> ``high'' level you need to understand the high level.  Don't attempt
> to run before you can crawl.

Bottom line: Some very basic things in R violate users' reasonable
expectations and moreover are internally inconsistent.  You have to be
careful about this whenever you work in R, even at an elementary
level.

           -s



