[R] aggregate() runs out of memory

Sam Steingold sds at gnu.org
Tue Nov 27 19:50:27 CET 2012


> * Steve Lianoglou <znvyvatyvfg.ubarlcbg at tznvy.pbz> [2012-11-27 12:53:23 -0500]:
> On Tue, Nov 27, 2012 at 11:29 AM, Sam Steingold <sds at gnu.org> wrote:
>>> * Steve Lianoglou <znvyvatyvfg.ubarlcbg at tznvy.pbz> [2012-11-26 19:47:25 -0500]:
> [snip]
>>> It just occurred to me that this is even better:
>>>
>>> R> setkeyv(f, c("share.id", "delay"))
>>> R> result <- f[,  list(min=delay[1L], max=delay[.N], count=.N,
>>> country=country[1L]), by="share.id"]
>>>
>>
>> this assumes that delays are sorted (like in my example)
>> which, in reality, they are not.
>
> When you include "delay" in the call to `setkeyv` as I did above, it
> sorts low to high w/in each "share.id" group.

Ah, but then I would have to _sort_ (~n*log(n)) by delay within each ID
group, while all I care about is min/max (~n).
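
A minimal sketch of the no-sort variant, assuming the data.table package
and a made-up toy table (the column names share.id, delay and country
follow the thread; the values are hypothetical). It groups by share.id
only and calls min() and max(), each of which makes a single pass over
the group's delays, so no key on delay is needed:

```r
library(data.table)

# Hypothetical toy data; delays are deliberately unsorted within each id.
f <- data.table(
  share.id = c("a", "b", "a", "b", "a"),
  delay    = c(5, 2, 1, 9, 3),
  country  = c("US", "DE", "US", "DE", "US")
)

# No setkeyv() on delay: min() and max() scan each group once,
# rather than relying on delay[1L] and delay[.N] of a sorted group.
result <- f[, list(min = min(delay), max = max(delay), count = .N,
                   country = country[1L]), by = "share.id"]
print(result)
```

Whether this is actually faster than the keyed version on real data
would need to be measured; the sketch only shows that the sort is not
required to get the same columns.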

thanks again!

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://think-israel.org http://truepeace.org
http://thereligionofpeace.com http://mideasttruth.com http://www.memritv.org
If You Want Breakfast In Bed, Sleep In the Kitchen.



