[R] tapply within a data.frame: a simpler alternative?

baptiste auguie ba208 at exeter.ac.uk
Wed Dec 10 18:34:09 CET 2008


On 10 Dec 2008, at 17:25, hadley wickham wrote:

> On Wed, Dec 10, 2008 at 11:02 AM, baptiste auguie  
> <ba208 at exeter.ac.uk> wrote:
>> Dear list,
>>
>> I have a data.frame with x, y values and a 3-level factor "group", say.
>> I want to create a new column in this data.frame with the values of y
>> scaled within each group so that the group's maximum is 1. Perhaps the
>> example below describes it best:
>>
>>> x <- seq(0, 10, len=100)
>>> # note how the y values have a different maximum depending on the group
>>> my.df <- data.frame(x = rep(x, 3),
>>>       y = c(3*sin(x), 2*cos(x), cos(2*x)),
>>>       group = factor(rep(c("sin", "cos", "cos2"), each=100)))
>>> library(reshape)
>>> df.melt <- melt(my.df, id=c("x","group")) # make a long format
>>> # order the data.frame by the group factor
>>> df.melt <- df.melt[ order(df.melt$group) ,]
>>> # calculate the normalised value per group and assign it to a new column
>>> df.melt$norm <- do.call(c, tapply(df.melt$value, df.melt$group,
>>>       function(.v) {.v / max(.v)}))
>>> library(lattice)
>>> # check that it worked
>>> xyplot(norm + value ~ x, groups=group, data=df.melt, auto.key=TRUE)
>>
>>
>> This procedure works, but it feels like I'm reinventing the wheel using
>> hammer and saw. I tried to use aggregate, by, ddply (plyr package), but I
>> couldn't find anything straightforward.
>
> It's pretty easy with ddply:
>
> df.melt <- ddply(df.melt, .(group), transform, norm = value / max(value))
>
> Hadley
>
> --
> http://had.co.nz/



Very nice indeed! My test failed as I somehow misunderstood the syntax  
and didn't think of applying transform().
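
For the archives, the same per-group scaling can also be sketched in base R
with ave(), which is not what was suggested in this thread but avoids both
the melt step and the sorting by group. This is only a minimal sketch,
assuming the my.df example from my original post; the column name norm and
the helper variable group.max are just illustrative choices.

```r
# Rebuild the example data from the original post
x <- seq(0, 10, length.out = 100)
my.df <- data.frame(
  x = rep(x, 3),
  y = c(3 * sin(x), 2 * cos(x), cos(2 * x)),
  group = factor(rep(c("sin", "cos", "cos2"), each = 100))
)

# ave() applies FUN within each level of group and returns the results
# in the original row order, so no prior sorting is required
my.df$norm <- ave(my.df$y, my.df$group, FUN = function(v) v / max(v))

# Each group's maximum of the scaled column should now be 1
group.max <- tapply(my.df$norm, my.df$group, max)
print(group.max)
```

Because ave() keeps the rows in place, the result can be assigned straight
back to the data.frame, whereas my tapply() version only lines up after
ordering by the group factor.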

Many thanks too,

baptiste


