[Rd] Improve aggregate.default ...?

Gavin Simpson gavin.simpson at ucl.ac.uk
Sat May 9 14:55:20 CEST 2009


On Sat, 2009-05-09 at 08:23 -0400, Gabor Grothendieck wrote:
> Try this:
> 
> > aggregate(dat["A"], dat["Group"], mean)
>   Group         A
> 1     1 0.4944810
> 2     2 0.4765412
> 3     3 0.4521068
> 4     4 0.4989000

Thanks Gabor. Ideally, aggregate.default should "work" whatever indexing
one uses. Here you are relying on the fact that a data.frame is a special
case of a list, which is not the way most help resources introduce
subsetting for data frames.
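
To make the distinction concrete, here is a small sketch (the seed is
arbitrary and only there for reproducibility) of why the list-style
index keeps the name while `$` does not:

```r
set.seed(1)  # arbitrary seed, not part of the original example
dat <- data.frame(A = runif(100), B = rnorm(100),
                  Group = gl(4, 25))

by_list   <- dat["A"]  # list-style index: one-column data frame named "A"
by_dollar <- dat$A     # `$` extraction: bare numeric vector, name is gone

class(by_list)    # "data.frame"
class(by_dollar)  # "numeric"

## A data frame goes straight to aggregate.data.frame, so the column
## name survives; dat["Group"] works as 'by' because a data frame is a list.
res <- aggregate(dat["A"], dat["Group"], mean)
names(res)        # "Group" "A"
```

With the data frame input, aggregate.default is never involved, which is
why the name is preserved.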

For personal use, I can use my own version of aggregate.default and, as I
dislike using `$`, preferring with(), I don't run the risk of
non-syntactic names being produced.
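
As a rough illustration of why with() sidesteps the problem, the sketch
below uses a throwaway helper, arg_label() (a hypothetical name, not part
of R), that just reports what deparse(substitute(x)) sees:

```r
## Hypothetical helper: returns the deparsed argument expression,
## the same thing my modified aggregate.default uses as a column name.
arg_label <- function(x) deparse(substitute(x))

dat <- data.frame(A = runif(100), Group = gl(4, 25))

lab_with   <- with(dat, arg_label(A))  # "A"
lab_dollar <- arg_label(dat$A)         # "dat$A"

## Only the first is a syntactic name; make.names() shows how R would
## have to mangle the second to make it one.
make.names(lab_with)    # "A"
make.names(lab_dollar)  # "dat.A"
```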

I was really looking for ideas for improving aggregate.default in
general. The solution I posted has its own infelicities...
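
One possible direction, offered only as a sketch and not as a proposal
for the real method: take the name from the call only when the argument
is a plain symbol, and otherwise fall back to the current 'x'. The
wrapper name aggregate_named() is made up for illustration, and I have
not checked how this interacts with the examples run by make check-all.

```r
## Hypothetical wrapper, not the real aggregate.default.
aggregate_named <- function(x, ...) {
    expr <- substitute(x)            # capture the argument expression first
    if (is.ts(x))
        return(aggregate(x, ...))    # leave time series handling alone
    ## Use the name only if the caller passed a bare symbol (e.g. A);
    ## anything else (dat$A, A + 1, ...) keeps the existing default "x".
    nam <- if (is.name(expr)) as.character(expr) else "x"
    x <- as.data.frame(x)
    if (ncol(x) == 1L)               # only rename the single-column case
        names(x) <- nam
    aggregate(x, ...)                # dispatches to aggregate.data.frame
}

dat <- data.frame(A = runif(100), B = rnorm(100),
                  Group = gl(4, 25))
r1 <- with(dat, aggregate_named(A, by = list(Group = Group), FUN = mean))
r2 <- aggregate_named(dat$A, by = list(Group = dat$Group), FUN = mean)
names(r1)  # "Group" "A"
names(r2)  # "Group" "x"
```

This avoids non-syntactic names, at the cost of behaving differently
depending on how the argument was written, which is an infelicity of its
own.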

Cheers,

G

> 
> On Sat, May 9, 2009 at 8:14 AM, Gavin Simpson <gavin.simpson at ucl.ac.uk> wrote:
> > Hi,
> >
> > I find it a bit annoying that aggregate.default forces the returned
> > object to lose the 'name' of the variable aggregated, replacing it with
> > 'x'.
> >
> > A brief example:
> >
> >> dat <- data.frame(A = runif(100), B = rnorm(100),
> > +                   Group = gl(4, 25))
> >> with(dat, aggregate(A, by = list(Group = Group), FUN = mean))
> >  Group         x
> > 1     1 0.6523228
> > 2     2 0.4544317
> > 3     3 0.4619624
> > 4     4 0.4703156
> >
> > This arises because aggregate.default has:
> >
> > function (x, ...)
> > {
> >    if (is.ts(x))
> >        aggregate.ts(as.ts(x), ...)
> >    else aggregate.data.frame(as.data.frame(x), ...)
> > }
> >
> > which recasts x as a data frame, but doesn't make any effort to supply a
> > name. Can we do a better job of supplying a useful name?
> >
> > My first attempt is:
> >
> > aggregate.default <- function(x, ...) {
> >    if (is.ts(x))
> >        aggregate.ts(as.ts(x), ...)
> >    else {
> >        nam <- deparse(substitute(x))
> >        x <- as.data.frame(x)
> >        names(x) <- nam
> >        aggregate.data.frame(x, ...)
> >    }
> > }
> >
> > Which works for the brief example above:
> >
> >> with(dat, aggregate(A, by = list(Group = Group), FUN = mean))
> >  Group         A
> > 1     1 0.4269715
> > 2     2 0.5479352
> > 3     3 0.5091543
> > 4     4 0.4926412
> >
> > However, it fails make check-all because examples rely on the
> > returned object having a column named 'x'. I also note that this
> > might have the annoying side effect of producing odd names if we use
> > the following incantation:
> >
> >> res <- aggregate(dat$A, by = list(Group = dat$Group), FUN = mean)
> >> str(res)
> > 'data.frame':   4 obs. of  2 variables:
> >  $ Group: Factor w/ 4 levels "1","2","3","4": 1 2 3 4
> >  $ dat$A: num  0.427 0.548 0.509 0.493
> >> res$dat$A
> > Error in res$dat$A : $ operator is invalid for atomic vectors
> >> res$`dat$A`
> > [1] 0.4269715 0.5479352 0.5091543 0.4926412
> >
> > Is there a way of coming up with a better way to name the aggregated
> > variable? Would a change of this kind be something R Core would consider
> > making to aggregate.default if a good solution is found?
> >
> > Thanks in advance,
> >
> > G
> > --
> > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> >  Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
> >  ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
> >  Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
> >  Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
> >  UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
> > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >


