[R] tapply within a data.frame: a simpler alternative?

hadley wickham h.wickham at gmail.com
Wed Dec 10 18:25:28 CET 2008


On Wed, Dec 10, 2008 at 11:02 AM, baptiste auguie <ba208 at exeter.ac.uk> wrote:
> Dear list,
>
> I have a data.frame with x, y values and a 3-level factor "group", say. I
> want to create a new column in this data.frame with the values of y scaled
> so that the maximum within each group is 1. Perhaps the example below
> describes it best:
>
>> x <- seq(0, 10, len=100)
>> # note how the y values have a different maximum depending on the group
>> my.df <- data.frame(x = rep(x, 3), y=c(3*sin(x), 2*cos(x), cos(2*x)),
>>        group = factor(rep(c("sin", "cos", "cos2"), each=100)))
>> library(reshape)
>> df.melt <- melt(my.df, id=c("x","group")) # make a long format
>> # order the data.frame by the group factor
>> df.melt <- df.melt[ order(df.melt$group) ,]
>> # calculate the normalised value per group and assign it to a new column
>> df.melt$norm <- do.call(c, tapply(df.melt$value, df.melt$group,
>>        function(.v) {.v / max(.v)}))
>> library(lattice)
>> # check that it worked
>> xyplot(norm + value ~ x,groups=group,  data=df.melt, auto.key=T)
>
>
> This procedure works, but it feels like I'm reinventing the wheel using a
> hammer and saw. I tried to use aggregate, by, ddply (plyr package), but I
> couldn't find anything straightforward.

It's pretty easy with ddply:

library(plyr)
# after melt() the y values live in the "value" column of df.melt
df.melt <- ddply(df.melt, .(group), transform, norm = value / max(value))
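
If you'd rather stay in base R, ave() can do the same kind of per-group
scaling. What follows is only a minimal sketch, assuming the wide my.df from
your example (rebuilt here so it runs on its own) and reusing "norm" as the
new column name; it is one possible approach, not necessarily the tidiest:

```r
# Rebuild the example data from the question (wide format, one y column).
x <- seq(0, 10, length.out = 100)
my.df <- data.frame(
  x     = rep(x, 3),
  y     = c(3 * sin(x), 2 * cos(x), cos(2 * x)),
  group = factor(rep(c("sin", "cos", "cos2"), each = 100))
)

# ave() returns a vector as long as y that holds, for every row, the
# maximum of that row's group, so no sorting by group is needed first.
my.df$norm <- with(my.df, y / ave(y, group, FUN = max))

# Sanity check: the largest scaled value in each group.
tapply(my.df$norm, my.df$group, max)
```

Because ave() keeps the original row order, this also sidesteps the
order()/do.call(c, tapply(...)) step from your version.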

Hadley

-- 
http://had.co.nz/


