[R] question about the aggregate function with respect to order of levels of grouping elements

Gabor Grothendieck ggrothendieck at gmail.com
Sun Dec 16 15:50:15 CET 2007


This does look strange.  Note that aggregate.zoo in the zoo package
would work here:

> library(zoo)
> aggregate(zoo(rnum, dts), as.yearmon, sum)
   Jan 2001    Feb 2001    Mar 2001    Apr 2001    May 2001    Jun 2001
 4.43610085  0.49842227  7.52139932  1.47917343 10.64459923 -1.22530586
   Jul 2001    Aug 2001    Sep 2001    Oct 2001    Nov 2001    Dec 2001
 8.19563685  1.57626974  1.28842871  2.50540074  0.71156951  0.54118342
   Jan 2002    Feb 2002    Mar 2002    Apr 2002    May 2002    Jun 2002
-0.41292840 -2.41301496  3.23783551  0.63914807 -1.46357402  2.91651492
   Jul 2002    Aug 2002    Sep 2002    Oct 2002    Nov 2002    Dec 2002
 2.17263290 -2.30981022 -9.60701788  1.16504368 -3.07038254  1.38281927
   Jan 2003    Feb 2003    Mar 2003    Apr 2003    May 2003    Jun 2003
 4.48761479  2.42455090 -0.03743888  1.11223001 -4.07988016 -1.15116293
   Jul 2003    Aug 2003    Sep 2003    Oct 2003    Nov 2003    Dec 2003
-7.15292576 -2.34231702 -0.48132751 11.74252191  2.51063034 -4.35801058
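
If staying with aggregate() is preferred, one possible workaround is to reset the
month levels after aggregating. The following is only a minimal sketch under some
assumptions: it uses base R Date objects rather than chron, builds the month labels
from month.abb, and fixes a seed for reproducibility; none of these come from the
example quoted below.

```r
# Sketch only: Date objects and set.seed() are assumptions, not the poster's setup.
set.seed(1)
dts  <- seq(as.Date("2001-01-01"), as.Date("2003-12-31"), by = "day")
rnum <- rnorm(length(dts))

# Grouping variables: year as text, month as a locale-independent abbreviation.
yr  <- format(dts, "%Y")
mon <- month.abb[as.integer(format(dts, "%m"))]

agg <- aggregate(rnum, list(year = yr, month = mon), sum)

# Re-impose calendar order on the month column, whatever aggregate() returned,
# then sort the rows by year and month.
agg$month <- factor(as.character(agg$month), levels = month.abb)
agg <- agg[order(agg$year, agg$month), ]

levels(agg$month)
```

Converting through as.character() first means the same line should work whether
aggregate() hands back the month column as a factor or as a character vector.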


On Dec 16, 2007 9:23 AM, tom soyer <tom.soyer at gmail.com> wrote:
> Hi,
>
> I am using aggregate() to add up groups of data according to year and month.
> It seems that the function aggregate() automatically sorts the levels of
> factors of the grouping elements, even if the order of the levels of factors
> is supplied. I am wondering if this is a bug, or if I missed something
> important. Below is an example that shows what I mean. Does anyone know if
> this is just the way the aggregate function works, or are there ways
> to force aggregate() to keep the order of levels of factors supplied by the
> grouping elements? Thanks!
>
> library(chron)
> dts=seq.dates("1/1/01","12/31/03")
> rnum=rnorm(length(dts))
> df=data.frame(date=dts,obs=rnum)
> agg=aggregate(df[,2],list(year=years(df[,1]),month=months(df[,1])),sum)
> levels(agg$month) # aggregate() automatically generates levels sorted
> # alphabetically.
>
> [1] "Apr" "Aug" "Dec" "Feb" "Jan" "Jul" "Jun" "Mar" "May" "Nov" "Oct" "Sep"
>
> fmonth=factor(months(df[,1]))
> levels(fmonth) # factor() automatically generates the correct order of
> # levels.
>
> [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
>
>
> agg2=aggregate(df[,2],list(year=years(df[,1]),month=fmonth),sum)
> levels(agg2$month) # even if a factor with levels in the correct order is
> # supplied, aggregate() sorts the levels alphabetically regardless.
>
> [1] "Apr" "Aug" "Dec" "Feb" "Jan" "Jul" "Jun" "Mar" "May" "Nov" "Oct" "Sep"
>
>
> --
> Tom


