[R] collapsing a data frame
hadley wickham
h.wickham at gmail.com
Sat Oct 13 06:53:20 CEST 2007
> > Here's a solution that takes the first element of each factor
> > and the mean of each numeric variable. I can imagine there
> > are more general/flexible solutions. (One might want to
> > specify more than one summary function, or specify that
> > factors that vary within group should be dropped.)
> >
> > vtype = sapply(h,class) ## variable types [numeric or factor]
> > vtypes = unique(vtype) ## possible types
> > v2 = lapply(vtypes,function(z) which(vtype==z)) ## which are which?
> > cfuns = list(factor=function(z)z[1],numeric=mean)## functions to apply
> > m = mapply(function(w,f) { aggregate(h[w],list(h$BROOD),f) },
> > v2,cfuns[vtypes],SIMPLIFY=FALSE) ## index by type so each group gets its own function
> > data.frame(m[[1]],m[[2]][-1])
> >
> > My question is whether this is re-inventing the wheel. Is there
> > some function or package that performs this task?
>
> Maybe the reshape package? http://had.co.nz/reshape
>
> hm <- melt(h, measure.var = "TICKS")
> cast(hm, BROOD + HEIGHT + YEAR + LOCATION ~ ., mean)
> cast(hm, BROOD + HEIGHT + LOCATION ~ YEAR, mean)
> cast(hm, BROOD ~ HEIGHT ~ YEAR, mean)
>
> You should be able to create just about any data structure you need,
> and if you can't, let me know.
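In case it helps to see what the three cast() formulas are asking for, here is a rough base R sketch of the same shapes. It is only an approximation, not how reshape works internally. The data frame is a made-up stand-in for h: the column names come from this thread, the values are invented.

```r
## Hypothetical stand-in for h (values invented for illustration)
h <- data.frame(
  BROOD    = factor(c(1, 1, 2, 2, 3, 3)),
  HEIGHT   = factor(c("low", "low", "high", "high", "low", "low")),
  YEAR     = factor(c(95, 96, 95, 96, 95, 96)),
  LOCATION = factor(c("A", "A", "B", "B", "A", "A")),
  TICKS    = c(0, 2, 4, 6, 1, 3)
)

## Roughly BROOD + HEIGHT + YEAR + LOCATION ~ . :
## one row per combination of the grouping variables
long <- aggregate(TICKS ~ BROOD + HEIGHT + YEAR + LOCATION, data = h, FUN = mean)

## Roughly BROOD + HEIGHT + LOCATION ~ YEAR :
## row groups down the side, one column per year
rows <- interaction(h$BROOD, h$HEIGHT, h$LOCATION, drop = TRUE)
wide <- tapply(h$TICKS, list(rows, h$YEAR), mean)

## Roughly BROOD ~ HEIGHT ~ YEAR :
## a 3-d array, with NA for combinations that never occur
arr <- tapply(h$TICKS, h[c("BROOD", "HEIGHT", "YEAR")], mean)
```

The point of the formula interface is that these three layouts differ only in where the variables sit relative to the ~, whereas in base R each one needs a different function call.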
Oh, and you can easily use multiple summary functions too:
cast(hm, BROOD + HEIGHT + YEAR + LOCATION ~ ., c(mean, sd, length))
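For comparison, something similar can be approximated in base R with aggregate() and a function that returns a named vector. This is a minimal sketch on invented data, grouping by BROOD alone to keep it short; the layout of the result will not match cast() exactly.

```r
## Hypothetical data: column names from the thread, values invented
h <- data.frame(
  BROOD = factor(c(1, 1, 1, 2, 2, 2)),
  TICKS = c(0, 2, 4, 1, 1, 7)
)

## aggregate() stores the three summaries as one matrix-valued column
res <- aggregate(TICKS ~ BROOD, data = h,
                 FUN = function(x) c(mean = mean(x), sd = sd(x), n = length(x)))

## Flatten the matrix column into ordinary columns:
## BROOD, TICKS.mean, TICKS.sd, TICKS.n
res <- do.call(data.frame, res)
```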
Hadley
--
http://had.co.nz/