[R] Improve code efficiency with do.call, rbind and split construction

Charles C. Berry ccberry at ucsd.edu
Fri Sep 2 20:50:34 CEST 2016


On Fri, 2 Sep 2016, Bert Gunter wrote:
[snip]
>
> The "trick" is to use tapply() to select the necessary row indices of
> your data frame and forget about all the do.call and rbind stuff. e.g.
>

I agree the way to go is to "select the necessary row indices", but I get 
there a different way: duplicated() with fromLast=TRUE flags every row 
except the last one for each f/g combination. See below.

>> set.seed(1001)
>> df <- data.frame(f = factor(sample(LETTERS[1:4],100,replace=TRUE)),
> +                  g = factor(sample(letters[1:6],100,replace=TRUE)),
> +                  y = runif(100))
>>
>> ix <- seq_len(nrow(df))
>>
>> ix <- with(df,tapply(ix,list(f,g),function(x)x[length(x)]))
>> ix
>   a  b   c  d  e  f
> A 94 69 100 59 80 87
> B 89 57  65 90 75 88
> C 85 92  86 95 97 62
> D 47 73  72 74 99 96


   jx <- which( !duplicated( df[,c("f","g")], fromLast=TRUE ))

   xtabs(jx~f+g,df[jx,]) ## Show equivalence to Bert's `ix'

    g
f     a   b   c   d   e   f
   A  94  69 100  59  80  87
   B  89  57  65  90  75  88
   C  85  92  86  95  97  62
   D  47  73  72  74  99  96
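
Once the indices are in hand, the subset itself is a single indexing step. 
The sketch below assumes the goal is the last row of each f/g group (as the 
x[length(x)] in Bert's tapply() call suggests). The split/lapply/rbind 
version is only my guess at the kind of construction being replaced, since 
the original code was snipped; the `row' column and the names `last_rows' 
and `slow' are introduced here purely for illustration.

```r
set.seed(1001)
df <- data.frame(f = factor(sample(LETTERS[1:4], 100, replace = TRUE)),
                 g = factor(sample(letters[1:6], 100, replace = TRUE)),
                 y = runif(100))
df$row <- seq_len(nrow(df))  # illustrative: remember each row's position

## Indices of the last row for each observed (f, g) combination
jx <- which(!duplicated(df[, c("f", "g")], fromLast = TRUE))

## Direct subset: no split, no rbind
last_rows <- df[jx, ]

## Assumed shape of the slower construction, for comparison only
slow <- do.call(rbind,
                lapply(split(df, list(df$f, df$g), drop = TRUE),
                       function(d) d[nrow(d), ]))

## Both approaches pick the same rows (the order may differ)
stopifnot(identical(sort(slow$row), jx))
```

The direct subset keeps the rows in their original order, while the split 
version orders them by group, hence the sort() in the comparison.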


Chuck


