[R] aggregate() runs out of memory

Sam Steingold sds at gnu.org
Tue Nov 27 17:29:02 CET 2012


> * Steve Lianoglou <znvyvatyvfg.ubarlcbg at tznvy.pbz> [2012-11-26 19:47:25 -0500]:
>
> On Monday, November 26, 2012, Sam Steingold wrote:
> [snip]
>
>>
>> there is precisely one country for each id.
>> i.e., unique(country) is the same as country[1].
>> thanks a lot for the suggestion!
>>
>> > R> result <- f[, list(min=min(delay), max=max(delay),
>> > count=.N,country=country[1L]), by="share.id"]
>
>
> And is it performant?

acceptable.

> It just occurred to me that this is even better:
>
> R> setkeyv(f, c("share.id", "delay"))
> R> result <- f[,  list(min=delay[1L], max=delay[.N], count=.N,
> country=country[1L]), by="share.id"]
>

this assumes that delays are sorted (like in my example)
which, in reality, they are not.
thanks for your help!
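
A small check of the two variants quoted above, as a sketch only: the toy table and its values are made up for illustration, and the delays are deliberately left unsorted. As far as I understand data.table, setkeyv() physically reorders the rows by the key columns (ascending), so the keyed variant may not actually need pre-sorted input; the comparison below is one way to test that on real data.

```r
library(data.table)

# Hypothetical toy data: delays are NOT sorted within share.id
f <- data.table(
  share.id = c("a", "b", "a", "b", "a"),
  delay    = c(30, 5, 10, 50, 20),
  country  = c("US", "DE", "US", "DE", "US")
)

# Variant 1: explicit min()/max(), independent of row order
r1 <- f[, list(min = min(delay), max = max(delay),
               count = .N, country = country[1L]), by = "share.id"]
r1 <- r1[order(share.id)]

# Variant 2: key on (share.id, delay) first.
# setkeyv() sorts by reference, so work on a copy to leave f untouched.
g <- copy(f)
setkeyv(g, c("share.id", "delay"))
r2 <- g[, list(min = delay[1L], max = delay[.N],
               count = .N, country = country[1L]), by = "share.id"]

# Compare the two results column by column
print(identical(r1$min, r2$min))
print(identical(r1$max, r2$max))
print(identical(r1$count, r2$count))
```

If the comparisons come out TRUE on the real table as well, the keyed variant should be safe to use; the cost is the one-time sort done by setkeyv().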

-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://honestreporting.com
http://americancensorship.org http://memri.org http://www.memritv.org
Illiterate?  Write today, for free help!



