[R] Sum efficiently from large matrix according to re-occurring levels of factor?

hadley wickham h.wickham at gmail.com
Mon Jul 21 01:50:41 CEST 2008


On Sun, Jul 20, 2008 at 4:47 PM, hadley wickham <h.wickham at gmail.com> wrote:
> On Sun, Jul 20, 2008 at 4:16 PM, Ralph S. <ruffel1 at hotmail.com> wrote:
>>
>> Hi,
>>
>> I am trying to calculate the sum for each occurrence of the level of a factor in a very large matrix. In addition, I want to save that sum together with the information of the level of the factor and the level of a second factor.
>>
>> My matrix looks like this:
>>
>> x<-matrix(c(1,1,1,2,2,3,3,1,1,7,7,7,4,4,2,2,7,7,1,1,1,1,1,1,2,5,5),9,3)
>>
>> I want to sum according to the levels in the first column and save the sum with the information of the level in the first and the second column in a new matrix.
>>
>> That is, I want output in the matrix of form:
>>
>> 1 7 3
>> 2 4 2
>> 3 2 3
>> 1 7 10
>>
>
> Why that and not:
>
> 1 7 13
> 2 4 2
> 3 2 3
>
> ?

Here's a solution for that case:

index <- x[, 2] + x[, 1] * max(x[, 2])
cbind(x[!duplicated(index), 1:2], tapply(x[, 3], index, sum))

It takes about half a second for a million-row matrix.
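
One caveat, sketched rather than benchmarked: tapply() returns its sums in sorted order of index, while !duplicated() keeps rows in order of first appearance, so the two columns line up only when the groups first appear in increasing index order (as they do in the example). The variation below sorts the key rows to match. It assumes columns 1 and 2 hold integers and column 2 is positive, so that index is unique per pair. It also shows one possible way to get the per-run output from the original question. The names first, by_pair, run_id, starts and by_run are made up for illustration.

```r
x <- matrix(c(1,1,1,2,2,3,3,1,1,
              7,7,7,4,4,2,2,7,7,
              1,1,1,1,1,1,2,5,5), 9, 3)

# Same key as above. Assumption: integer columns, column 2 positive,
# so each (column 1, column 2) pair maps to a distinct index value.
index <- x[, 2] + x[, 1] * max(x[, 2])

# (a) One row per distinct pair.
# Take the first row of each group, then reorder those rows by index
# so they match the sorted order in which tapply() returns the sums.
first <- which(!duplicated(index))
first <- first[order(index[first])]
by_pair <- cbind(x[first, 1:2, drop = FALSE],
                 as.vector(tapply(x[, 3], index, sum)))

# (b) One row per consecutive run of the same pair.
# run_id increases by one each time index changes from the previous row,
# so its sorted order is already the order of first appearance.
n <- nrow(x)
run_id <- cumsum(c(TRUE, index[-1] != index[-n]))
starts <- !duplicated(run_id)
by_run <- cbind(x[starts, 1:2, drop = FALSE],
                as.vector(tapply(x[, 3], run_id, sum)))

print(by_pair)
print(by_run)
```

For the example matrix, by_pair has three rows (one per pair) and by_run has four, because the pair (1, 7) occurs in two separate runs.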

Hadley



-- 
http://had.co.nz/


