[R] More efficient way to use ifelse()?

Duncan Murdoch murdoch.duncan at gmail.com
Wed May 26 11:43:49 CEST 2010


Ian Dworkin wrote:
> # This is more about trying to find a more efficient way to code some
> simple vectorized computations using ifelse().
>
> # Say you have some vector representing a factor with a number of
> levels (6 in this case), representing the location that samples were
> collected.
>
> Population <- gl( n=6, k=5,length=120, labels =c("CO", "CN","Ga","KO",
> "Mw", "Ng"))
>
>
> # You would like to assign a particular value to each level of
> population (in this case the elevation at which they were collected).
> In a vectorized approach (for speed... pretend this was a big data
> set..)
>
> elevation <-  ifelse(Population=="CO", 2169,
>  ifelse(Population=="CN", 1121,
>   ifelse(Population=="Ga", 500,
>     ifelse(Population=="KO", 2500,
>     	ifelse(Population=="Mw", 625,
>     	  ifelse(Population=="Ng", 300, NA ))))))
>     	
> # Which is fine, but is a pain to write...
>
> # So I was trying to think about how to vectorize directly. i.e use
> vectors within the test, and for return values for T and F
>
> elevation.take.2 <- ifelse(Population==c("CO",  "CN", "Ga", "KO",
> "Mw", "Ng"), c(2169, 1121, 500, 2500, 625, 300), c(NA, NA, NA, NA, NA,
> NA))
>
> # It makes sense to me why this does not work (elevation.take.2), but
> I am not sure how to get it to work. Any suggestions? I suspect it
> involves a trick using "any" or "||" or something, but I can't seem to
> work it out.
>   
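
A short sketch of why elevation.take.2 falls short, assuming the same
Population factor as in the post (the names labs, hits and recycled are
only for illustration): == recycles the six labels along the 120-element
factor, so each element is compared with just one label, not with all six.

```r
Population <- gl(n = 6, k = 5, length = 120,
                 labels = c("CO", "CN", "Ga", "KO", "Mw", "Ng"))
labs <- c("CO", "CN", "Ga", "KO", "Mw", "Ng")

# Element i of Population is compared only with labs[(i - 1) %% 6 + 1].
hits <- Population == labs
recycled <- rep(labs, length.out = length(Population))
stopifnot(all(hits == (as.character(Population) == recycled)))

# ifelse() then fills in NA wherever the positions happen not to line up.
elevation.take.2 <- ifelse(hits, c(2169, 1121, 500, 2500, 625, 300), NA)
sum(is.na(elevation.take.2))  # number of samples left without an elevation
```

Only the elements whose label happens to sit at the matching recycled
position get a value; the rest come out as NA.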

In a case like this, indexing is often clearer than ifelse().  For example,

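results <- c(CO = 2169, CN = 1121, Ga = 500, KO = 2500, Mw = 625, Ng = 300)
# index by label; a bare factor index would be used as its integer codes
elevation <- results[as.character(Population)]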
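
A self-contained sketch of the named-vector lookup, using the elevations
from the original post. One assumption worth flagging: a factor used
directly as an index is treated as its integer codes, so the sketch
converts to character first, which makes the lookup go by label and also
requires every level (including "CO") to have an entry. unname() is only
there to drop the repeated names from the result.

```r
Population <- gl(n = 6, k = 5, length = 120,
                 labels = c("CO", "CN", "Ga", "KO", "Mw", "Ng"))

# One elevation per population, values as given in the original post.
results <- c(CO = 2169, CN = 1121, Ga = 500, KO = 2500, Mw = 625, Ng = 300)

# Look up by label rather than by the factor's integer codes.
elevation <- unname(results[as.character(Population)])

# Cross-check against the nested ifelse() version from the post.
nested <- ifelse(Population == "CO", 2169,
          ifelse(Population == "CN", 1121,
          ifelse(Population == "Ga", 500,
          ifelse(Population == "KO", 2500,
          ifelse(Population == "Mw", 625,
          ifelse(Population == "Ng", 300, NA))))))
stopifnot(all(elevation == nested))
```

A label missing from results would give NA for those samples, much like
the final NA in the nested ifelse() calls.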

Generally, indexing atomic vectors and matrices is very fast; indexing
data frames is much slower, so avoid data frames where speed is an issue.
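
A rough way to check this on your own machine, as a sketch only: the data,
sizes and names (x, df, idx) are made up for illustration, and the timings
will vary by machine and R version, so none are quoted here.

```r
set.seed(1)
x   <- runif(1e4)                       # plain atomic vector
df  <- data.frame(x = x)                # same values held in a data frame
idx <- sample(length(x), 1e4, replace = TRUE)

# Element-by-element indexing, where per-call overhead shows up most.
t_vec <- system.time(for (i in idx) x[i])
t_df  <- system.time(for (i in idx) df[i, "x"])
print(rbind(vector = t_vec, data.frame = t_df)[, "elapsed"])

# Both routes return the same values; only the cost differs.
stopifnot(identical(x[idx], df[idx, "x"]))
```

If the values have to live in a data frame, pulling the column out once
(for example with df$x) and indexing that vector avoids most of the
overhead.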

Duncan Murdoch


