[R] summing a large, partitioned data frame

Benilton Carvalho beniltoncarvalho at gmail.com
Mon Jan 25 18:18:35 CET 2010


Check aggregate(); the examples on its help page (?aggregate) are quite helpful.
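
A minimal sketch of how that could look for the rows you posted. It assumes the data frame is called refls, as in your message, and that the M/ISYM column has been read in under the syntactic name M.ISYM; that name, and the helper objects keys, sums, first and out, are only illustrative. It uses the `by =` form of aggregate() to sum I within each group, then merge() to attach those sums to the first row of each group, so BATCH and SIGI are taken from the first occurrence as in your expected output.

```r
# Toy copy of the ten rows shown in the question.
# The column name M.ISYM is an assumption (the post shows it as M/ISYM).
refls <- data.frame(
  H      = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
  K      = c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1),
  L      = c(5, 5, 5, 5, 5, 6, 6, 4, 4, 4),
  M.ISYM = c(21, 257, 257, 259, 259, 259, 259, 1, 5, 21),
  BATCH  = c(79, 6, 5, 60, 61, 88, 89, 42, 31, 39),
  I      = c(61.44117, 15.16316, 46.76152, 57.97305, 3.15614,
             380.04477, 624.90802, 2517.79492, 2525.67407, 2519.44653),
  SIGI   = c(2.20553, 0.54431, 1.67858, 2.08104, 0.11329,
             8.08878, 13.30038, 55.37144, 55.54472, 55.40777),
  row.names = c("43247", "1040", "2324", "31515", "35158",
                "51575", "51846", "28946", "23199", "23198")
)

# Columns that define a group
keys <- c("H", "K", "L", "M.ISYM")

# One row per (H, K, L, M.ISYM) combination, with I summed within the group
sums <- aggregate(refls["I"], by = refls[keys], FUN = sum)

# Keep BATCH and SIGI from the first row of each group, then attach the sums
first <- refls[!duplicated(refls[keys]), ]
first$I <- NULL
out <- merge(first, sums, by = keys)

print(out)
```

Note that merge() drops the original row names and puts the key columns first, so the column order of out differs from refls. If only the sums are needed, sums on its own is enough.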

b


On Mon, Jan 25, 2010 at 4:07 PM,  <james.foadi at diamond.ac.uk> wrote:
> Dear R community,
> I'm trying to develop a fast way of summing specific rows of a large data frame.
> Here is an example of the kind of data frames I'm dealing with:
>
>> refls
>      H K L M/ISYM BATCH          I     SIGI
> 43247 1 0 5     21    79   61.44117  2.20553
> 1040  1 0 5    257     6   15.16316  0.54431
> 2324  1 0 5    257     5   46.76152  1.67858
> 31515 1 0 5    259    60   57.97305  2.08104
> 35158 1 0 5    259    61    3.15614  0.11329
> 51575 1 0 6    259    88  380.04477  8.08878
> 51846 1 0 6    259    89  624.90802 13.30038
> 28946 1 1 4      1    42 2517.79492 55.37144
> 23199 1 1 4      5    31 2525.67407 55.54472
> 23198 1 1 4     21    39 2519.44653 55.40777
> ............................................
> ............................................
>
> I need to add up all I's with the same H, K, L and M/ISYM.
> The new data frame produced by this partial summing should, in this case, look like:
>
>      H K L M/ISYM BATCH          I     SIGI
> 43247 1 0 5     21    79   61.44117  2.20553
> 1040  1 0 5    257     6   61.92468  0.54431
> 31515 1 0 5    259    60   61.12919  2.08104
> 51575 1 0 6    259    88 1004.95279  8.08878
> 28946 1 1 4      1    42 2517.79492 55.37144
> 23199 1 1 4      5    31 2525.67407 55.54472
> 23198 1 1 4     21    39 2519.44653 55.40777
> ............................................
> ............................................
>
>
> Essentially, I add only those I's with the same H, K, L and M/ISYM, and place the sum
> in a single row of the new data frame. In other words, the rows are first partitioned and
> then summed within each partition.
>
> I have tried with a for loop, but it really takes too long.
>
> I was wondering whether anyone knows of a better and faster way of doing this operation.
>
>
> J
>
>
>
> Dr James Foadi PhD
> Membrane Protein Laboratory (MPL)
> Diamond Light Source Ltd
> Diamond House
> Harwell Science and Innovation Campus
> Chilton, Didcot
> Oxfordshire OX11 0DE
>
> Email    :  james.foadi at diamond.ac.uk
> Alt Email:  j.foadi at imperial.ac.uk
>


