[R] sum specific rows in a data frame

hadley wickham h.wickham at gmail.com
Thu Apr 15 16:42:13 CEST 2010


I think the development version also fixes that problem, but it's hard
to know without a reproducible example.
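
For reference, a reproducible example of the memory issue described in the quoted message below might look something like the following sketch. The data, the column values, and the `summarise` call are assumptions made for illustration; only the column names Year, Month and MonthBegin come from the thread, and the real data that ran out of memory was presumably much larger.

```r
library(plyr)

# Hypothetical data: one row per day over ten years (sizes are illustrative).
set.seed(1)
dates <- seq(as.Date("2000-01-01"), as.Date("2009-12-31"), by = "day")
dta <- data.frame(
  Year  = as.integer(format(dates, "%Y")),
  Month = as.integer(format(dates, "%m")),
  value = rnorm(length(dates))
)

# MonthBegin is fully determined by Year and Month, so adding it as a
# grouping key should not create any new groups.
dta$MonthBegin <- as.POSIXct(format(dates, "%Y-%m-01"), tz = "UTC")

# Group with and without the redundant key column.
by_ym  <- ddply(dta, c("Year", "Month"), summarise, total = sum(value))
by_ymb <- ddply(dta, c("Year", "Month", "MonthBegin"), summarise,
                total = sum(value))

# Both groupings should produce the same number of groups.
stopifnot(nrow(by_ym) == nrow(by_ymb))
```

Scaling up the number of rows and comparing memory use between the two `ddply` calls (for example with `gc()` before and after each call) would show whether the redundant key is what drives the extra memory.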

Hadley

On Thu, Apr 15, 2010 at 2:33 PM, Jeff Newmiler <jdnewmil at dcn.davis.ca.us> wrote:
> This is good news, although I recently ran into what I consider excessive memory use when adding key columns that don't change the number of groups. For example, when grouping by Year and Month, adding MonthBegin (a POSIXct column from which the Year and Month columns were derived) makes me run out of memory.
>
> hadley wickham <h.wickham at gmail.com> wrote:
>
>>On Thu, Apr 15, 2010 at 1:16 AM, Chuck <vijay.nori at gmail.com> wrote:
>>> Depending on the size of the dataframe and the operations you are
>>> trying to perform, aggregate or ddply may be better.  In the function
>>> below, df has the same structure as your dataframe.
>>
>>Current version of plyr:
>>
>>         agg  ddply
>>X10    0.005  0.007
>>X100   0.007  0.026
>>X1000  0.086  0.248
>>X10000 0.577  3.136
>>X1e.05 4.493 44.147
>>
>>Development version of plyr:
>>
>>         agg ddply
>>X10    0.003 0.005
>>X100   0.007 0.007
>>X1000  0.042 0.044
>>X10000 0.410 0.443
>>X1e.05 4.479 4.237
>>
>>So there are some big speed improvements in the works.
>>
>>Hadley
>>
>>
>>--
>>Assistant Professor / Dobelman Family Junior Chair
>>Department of Statistics / Rice University
>>http://had.co.nz/
>
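
The benchmark code behind the quoted timing tables is not included in this message. A comparison of that kind could be sketched roughly as below; the data layout, the group count, the helper name `time_one`, and the reading of the row labels (X10, X100, ...) as row counts are all assumptions, not the code that produced the numbers above.

```r
library(plyr)

# Hypothetical helper: time a grouped sum with aggregate() and ddply()
# on a data frame with n rows and roughly n / 10 groups.
time_one <- function(n) {
  df <- data.frame(
    g = sample(seq_len(max(1, n %/% 10)), n, replace = TRUE),
    x = runif(n)
  )
  c(
    agg   = system.time(aggregate(x ~ g, data = df, FUN = sum))[["elapsed"]],
    ddply = system.time(ddply(df, "g", summarise, x = sum(x)))[["elapsed"]]
  )
}

# Elapsed seconds for each row count; one row per size, one column per method.
sizes <- c(10, 100, 1000, 10000)
timings <- t(sapply(sizes, time_one))
rownames(timings) <- sizes
timings
```

Absolute timings will depend on the machine and on the installed plyr version, so only the relative gap between the two columns is meaningful.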



-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/


