[R] More efficient way to use ifelse()?
Duncan Murdoch
murdoch.duncan at gmail.com
Wed May 26 13:53:20 CEST 2010
Duncan Murdoch wrote:
> Ian Dworkin wrote:
>
>> # This is more about trying to find a more efficient way to code some
>> simple vectorized computations using ifelse().
>>
>> # Say you have some vector representing a factor with a number of
>> levels (6 in this case), representing the location that samples were
>> collected.
>>
>> Population <- gl( n=6, k=5,length=120, labels =c("CO", "CN","Ga","KO",
>> "Mw", "Ng"))
>>
>>
>> # You would like to assign a particular value to each level of
>> population (in this case the elevation at which they were collected).
>> In a vectorized approach (for speed... pretend this was a big data
>> set..)
>>
>> elevation <- ifelse(Population=="CO", 2169,
>> ifelse(Population=="CN", 1121,
>> ifelse(Population=="Ga", 500,
>> ifelse(Population=="KO", 2500,
>> ifelse(Population=="Mw", 625,
>> ifelse(Population=="Ng", 300, NA ))))))
>>
>> # Which is fine, but is a pain to write...
>>
>> # So I was trying to think about how to vectorize directly. i.e use
>> vectors within the test, and for return values for T and F
>>
>> elevation.take.2 <- ifelse(Population==c("CO", "CN", "Ga", "KO",
>> "Mw", "Ng"), c(2169, 1121, 500, 2500, 625, 300), c(NA, NA, NA, NA, NA,
>> NA))
>>
>> # It makes sense to me why this does not work (elevation.take.2), but
>> I am not sure how to get it to work. Any suggestions? I suspect it
>> involves a trick using "any" or "||" or something, but I can't seem to
>> work it out.
>>
>>
>
> In a case like this, often indexing is clearer than ifelse. For example,
>
> results <- c(CO = 2169, CN = 1121, Ga = 500, KO = 2500, Mw = 625, Ng = 300)
> elevation <- results[Population]
>
> Generally vector indexing of atomic vectors and matrices is very fast;
> indexing of data frames is much slower, so if speed is an issue, avoid them.
>
One followup: don't do this if Population is a factor. It will index
by the factor's integer codes rather than by the labels. You get the
same answer only when the names in "results" are in the same order as
levels(Population), which you can't count on in general. Indexing by
as.character(Population) avoids the problem.
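
A minimal sketch of the pitfall, assuming a lookup vector that holds all
six elevations from the original question. The vector is deliberately
listed out of level order here (that ordering is my assumption, purely for
illustration) so that the difference between indexing by code and indexing
by name is visible:

```r
# Rebuild the factor from the original question; levels keep the order given.
Population <- gl(n = 6, k = 5, length = 120,
                 labels = c("CO", "CN", "Ga", "KO", "Mw", "Ng"))

# Lookup table, intentionally NOT in the same order as levels(Population).
results <- c(Ng = 300, Mw = 625, KO = 2500, Ga = 500, CN = 1121, CO = 2169)

# Indexing by the factor uses its integer codes, so values are picked
# by position: "CO" has code 1 and therefore returns results[1] (Ng).
by_code <- results[Population]

# Converting to character first indexes by name, which is what we want.
elevation <- unname(results[as.character(Population)])

head(elevation)   # first five samples are "CO", the sixth is "CN"
```

Here by_code starts with the value for Ng (300) while elevation starts
with the value for CO (2169), so the as.character() step is what makes
the lookup independent of how the table happens to be ordered.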
Duncan Murdoch