[R] tapply within a data.frame: a simpler alternative?
Peter Dalgaard
P.Dalgaard at biostat.ku.dk
Wed Dec 10 18:29:43 CET 2008
baptiste auguie wrote:
> Dear list,
>
> I have a data.frame with x, y values and a 3-level factor "group", say.
> I want to create a new column in this data.frame with the values of y
> scaled to 1 by group. Perhaps the example below describes it best:
>
>> x <- seq(0, 10, len=100)
>> # note how the y values have a different maximum depending on the group
>> my.df <- data.frame(x = rep(x, 3), y=c(3*sin(x), 2*cos(x), cos(2*x)),
>>                     group = factor(rep(c("sin", "cos", "cos2"), each=100)))
>> library(reshape)
>> df.melt <- melt(my.df, id=c("x","group")) # make a long format
>> # order the data.frame by the group factor
>> df.melt <- df.melt[ order(df.melt$group) ,]
>> # calculate the normalised value per group and assign it to a new column
>> df.melt$norm <- do.call(c, tapply(df.melt$value, df.melt$group,
>>                         function(.v) {.v / max(.v)}))
>> library(lattice)
>> # check that it worked
>> xyplot(norm + value ~ x,groups=group, data=df.melt, auto.key=T)
>
>
> This procedure works, but it feels like I'm reinventing the wheel using
> hammer and saw. I tried to use aggregate, by, ddply (plyr package), but
> I couldn't find anything straightforward.
>
> I'll appreciate any input,
You (as many before you) have overlooked the ave() function, which can
replace the ordering as well as the do.call(c, tapply(...)) step.
Also, I fail to see what the melt()ing is good for:
> dim(my.df)
[1] 300 3
> dim(melt(my.df, id=c("x","group")) )
[1] 300 4
And the extra column ("variable") is just "y" in every row.
my.df <- transform(my.df, norm=ave(y, group,
                   FUN = function(.v) {.v / max(.v)}))
xyplot(norm + y ~ x,groups=group, data=my.df, auto.key=TRUE)
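
To see why no sorting is needed, here is a minimal sketch on a made-up
data frame (the names toy, group.max and by.hand are invented for this
illustration and are not part of the original example). ave() returns
its result in the original row order, and the function has to be passed
as FUN= by name, because unnamed arguments after the first are treated
as grouping variables. The last lines check the result against a
by-hand computation using tapply().

```r
# Small made-up data, deliberately NOT sorted by group
toy <- data.frame(
  group = factor(c("a", "b", "a", "c", "b", "c")),
  y     = c(2, 10, 4, 3, 5, 6)
)

# ave() applies FUN within each group and returns the values
# in the original row order, so no reordering is required
toy$norm <- ave(toy$y, toy$group, FUN = function(v) v / max(v))

# The same thing by hand: per-group maxima, looked up by group label
group.max <- tapply(toy$y, toy$group, max)
by.hand   <- toy$y / group.max[as.character(toy$group)]

# Both approaches should agree row by row
stopifnot(isTRUE(all.equal(toy$norm, as.numeric(by.hand))))
print(toy)
```

Within each group the largest y is mapped to 1, whatever order the rows
come in.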
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907