[R] Selection and aggregation in one operation?

Zeljko Vrba zvrba at ifi.uio.no
Tue May 26 14:13:55 CEST 2009


I have a large data frame with measurements such as:

id i v1  v2  v3
1  1 1.1 1.2 1.3
1  2 1.4 1.5 1.6
1  3 1.5 1.7 1.8
2  1 2.1 2.2 2.3
2  2 2.7 2.5 2.6
2  3 2.4 2.8 2.9

For each unique value of 'id' (which in the real data set is a combination of
three variables) I want to find the row whose v1 equals the median of v1 within
that group ('i' distinguishes measurements within a group), and return that row
with its remaining columns (v2 and v3) unchanged.  Thus, the desired result for
this small example is

id i v1  v2  v3
1  2 1.4 1.5 1.6
2  3 2.4 2.8 2.9
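
For reproducibility, the small example above can be built as follows.  This is
only a sketch: the names 'df' and 'want' are chosen here for illustration, and
'want' simply picks the two rows shown in the desired result by position.

```r
# Example data from the first table
df <- data.frame(
  id = c(1, 1, 1, 2, 2, 2),
  i  = c(1, 2, 3, 1, 2, 3),
  v1 = c(1.1, 1.4, 1.5, 2.1, 2.7, 2.4),
  v2 = c(1.2, 1.5, 1.7, 2.2, 2.5, 2.8),
  v3 = c(1.3, 1.6, 1.8, 2.3, 2.6, 2.9)
)

# Desired result: in each 'id' group, the row whose v1 is the group median
want <- df[c(2, 6), ]

# Group medians of v1, to check against the v1 column of 'want'
med <- tapply(df$v1, df$id, median)
```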

I have written a (rather clumsy, in my opinion) function to perform this task
(see below).  Is there a more "standard" way of achieving this?

The function is:
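agg.column <- function(df, key, groups, FUN)
{
  # 'groups' is a list of grouping vectors, each as long as nrow(df)
  for(i in seq_along(groups))
    groups[[i]] <- as.factor(groups[[i]])
  # drop=TRUE omits combinations of levels that do not occur in the data
  groups <- split(df, interaction(groups, lex.order=TRUE), drop=TRUE)
  ret <- data.frame()

  for(g in groups) {
    # aggregate the key column, then take the first row holding that value
    key.fun <- FUN(g[[key]])
    row.idx <- match(key.fun, g[[key]])
    ret <- rbind(ret, g[row.idx,])
  }
  ret
}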
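
For comparison, here is a shorter base-R sketch that appears to give the same
result on the small example.  It rebuilds the example data so that it runs on
its own; 'df', 'grp.median' and 'pick' are names made up for this illustration.
It assumes every group has an odd number of rows, so that the median is one of
the observed values, and unlike match() it keeps all tied rows rather than only
the first one.

```r
df <- data.frame(id = rep(1:2, each = 3), i = rep(1:3, times = 2),
                 v1 = c(1.1, 1.4, 1.5, 2.1, 2.7, 2.4),
                 v2 = c(1.2, 1.5, 1.7, 2.2, 2.5, 2.8),
                 v3 = c(1.3, 1.6, 1.8, 2.3, 2.6, 2.9))

# ave() returns, for every row, the median of v1 within that row's 'id' group
grp.median <- ave(df$v1, df$id, FUN = median)

# Keep the rows whose v1 equals the median of their group
pick <- df[df$v1 == grp.median, ]
pick
```

With several grouping variables, each of them can be passed to ave() as a
further argument before FUN, which then has to be given by name.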