[R] How to apply a function to subsets of a data frame *and* obtain a data frame again?

Paul Hiemstra paul.hiemstra at knmi.nl
Wed Aug 17 13:57:30 CEST 2011


On 08/17/2011 11:51 AM, Marius Hofert wrote:
> Dear all, 
>
> Thanks a lot for the quick help.
> Below is what I built with Nick's hint.
>
> Cheers,
>
> Marius
>
>
> library(plyr)
>
> set.seed(1)
> (df <- data.frame(Group=rep(c("Group1","Group2","Group3"), each=10), 
>                 Value=c(rexp(10, 1), rexp(10, 4), rexp(10, 10)))[sample(1:30,30),])
> # edf(x): evaluate the empirical CDF of x at each of its own values
> edf <- function(x) ecdf(x)(x)
>
> ddply(df, .(Group), function(df.) cbind(df., edf=edf(df.$Value))) 

Hadley's code is much shorter; I would use that syntax.
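
For anyone who wants to check the result without plyr, here is a minimal
base-R sketch of the same per-group computation. It is not from the thread:
using ave() and the name df_base are my own assumptions, chosen only for
illustration. The data and the edf() helper are the ones Marius posted.

```r
# Same example data as in the thread: three groups of 10 values, rows shuffled.
set.seed(1)
df <- data.frame(Group = rep(c("Group1", "Group2", "Group3"), each = 10),
                 Value = c(rexp(10, 1), rexp(10, 4), rexp(10, 10)))[sample(1:30, 30), ]

# ecdf(x) returns a step function; calling it on x gives, for each value,
# the proportion of observations in x that are <= that value.
edf <- function(x) ecdf(x)(x)

# Assumed alternative (not from the thread): ave() applies FUN within each
# group and returns the results in the original row order, so the output
# is still a data frame with one extra column.
df_base <- df
df_base$edf <- ave(df_base$Value, df_base$Group, FUN = edf)

head(df_base)
```

Within each group the new column should take the values 0.1, 0.2, ..., 1.0,
since each group has 10 distinct values.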

cheers,
Paul

>
> On 2011-08-17, at 13:38 , Hadley Wickham wrote:
>
>>> The following example does what you want using ddply:
>>>
>>> library(plyr)
>>> edfPerGroup = ddply(df, .(Group), summarise, edf = edf(Value), Value = Value)
>> Or slightly more succinctly:
>>
>> ddply(df, .(Group), mutate, edf = edf(Value))
>>
>> Hadley
>>
>> -- 
>> Assistant Professor / Dobelman Family Junior Chair
>> Department of Statistics / Rice University
>> http://had.co.nz/


-- 
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494

http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770


