[R] Selection and aggregation in one operation?

Gabor Grothendieck ggrothendieck at gmail.com
Tue May 26 14:42:11 CEST 2009


If, as in this example, i always runs 1, 2, ... within each group and
each group has an odd number of rows, then the middle row of each group
can be picked with:

do.call(rbind, by(DF, DF$id, function(x) x[median(x$i), ]))
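To try this out, here is a small self-contained sketch. It assumes DF holds the example data from the quoted message below. Note that x[median(x$i), ] takes the middle row by position, so it matches the desired output only when the rows of each group are already ordered by v1. If the row whose v1 equals the group median is wanted, one possible variant (my assumption about the intent, not part of the one-liner above) uses match() on v1. It relies on odd group sizes, so that the median is an observed value.

```r
# Example data copied from the quoted message.
DF <- data.frame(
  id = c(1, 1, 1, 2, 2, 2),
  i  = c(1, 2, 3, 1, 2, 3),
  v1 = c(1.1, 1.4, 1.5, 2.1, 2.7, 2.4),
  v2 = c(1.2, 1.5, 1.7, 2.2, 2.5, 2.8),
  v3 = c(1.3, 1.6, 1.8, 2.3, 2.6, 2.9)
)

# Middle row of each group by position (the one-liner above).
by.position <- do.call(rbind, by(DF, DF$id, function(x) x[median(x$i), ]))

# Variant: row whose v1 equals the group median of v1.
# Assumes an odd number of rows per group, so the median is an observed value.
by.value <- do.call(rbind,
                    by(DF, DF$id, function(x) x[match(median(x$v1), x$v1), ]))

print(by.position)
print(by.value)
```

For id 2 the two results differ: the positional version returns the i = 2 row, while the match() version returns the i = 3 row, whose v1 of 2.4 is the group median.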


On Tue, May 26, 2009 at 8:13 AM, Zeljko Vrba <zvrba at ifi.uio.no> wrote:
> I have a large data-frame with measurements such as:
>
> id i v1  v2  v3
> 1  1 1.1 1.2 1.3
> 1  2 1.4 1.5 1.6
> 1  3 1.5 1.7 1.8
> 2  1 2.1 2.2 2.3
> 2  2 2.7 2.5 2.6
> 2  3 2.4 2.8 2.9
>
> For each unique value of 'id' (which in the real data-set is a combination of
> three variables) I want to compute the median of v1 within each group ('i'
> distinguishes measurements within a group), and copy the values of the remaining
> columns (v2 and v3) from the row in which that median occurs.  Thus, the desired
> result for this small example is
>
> id i v1  v2  v3
> 1  2 1.4 1.5 1.6
> 2  3 2.4 2.8 2.9
>
> I have written a (rather clumsy, in my opinion) function to perform this task
> (see below).  Is there a more "standard" way of achieving this?
>
> The function is:
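> agg.column <- function(df, key, groups, FUN)
> {
>  # groups: list of grouping vectors, one per key variable
>  for(i in seq_along(groups))
>    groups[[i]] <- as.factor(groups[[i]])
>  # drop=TRUE skips key combinations that never occur in the data
>  groups <- split(df, interaction(groups, lex.order=TRUE), drop=TRUE)
>  ret <- data.frame()
>
>  for(g in groups) {
>    key.fun <- FUN(g[[key]])             # e.g. the group median of 'key'
>    row.idx <- match(key.fun, g[[key]])  # first row attaining that value
>    ret <- rbind(ret, g[row.idx,])
>  }
>  ret
> }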
>
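Since the real data set keys each group on a combination of several variables, the same idea can be sketched with split() on more than one column. The helper name pick.median.row, the columns k1 and k2, and the data in DF2 are all made up for illustration. As before, this assumes an odd number of rows per group.

```r
# Hypothetical helper: for each combination of the columns named in 'by.cols',
# return the row where 'key' equals the group median of 'key'.
# Assumes an odd number of rows per group (the median is then an observed value).
pick.median.row <- function(df, key, by.cols) {
  # drop = TRUE skips key combinations that never occur in the data
  pieces <- split(df, df[by.cols], drop = TRUE)
  rows <- lapply(pieces, function(g) g[match(median(g[[key]]), g[[key]]), ])
  do.call(rbind, rows)
}

# Made-up data with a two-column key (k1, k2); the combination (b, y) is absent.
DF2 <- data.frame(
  k1 = c("a", "a", "a", "a", "a", "a", "b", "b", "b"),
  k2 = c("x", "x", "x", "y", "y", "y", "x", "x", "x"),
  v1 = c(1.1, 1.4, 1.5, 2.1, 2.7, 2.4, 3.3, 3.1, 3.2),
  v2 = c(1.2, 1.5, 1.7, 2.2, 2.5, 2.8, 3.6, 3.4, 3.5)
)

res <- pick.median.row(DF2, "v1", c("k1", "k2"))
print(res)
```

Without drop = TRUE, the absent (b, y) combination would yield an empty piece, and indexing it with the NA returned by match() would add a row of NAs to the result.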
