[R] question about the aggregate function with respect to order of levels of grouping elements
Gabor Grothendieck
ggrothendieck at gmail.com
Sun Dec 16 15:50:15 CET 2007
This does look strange. Note that aggregate.zoo in the zoo package
would work here:
> library(zoo)
> aggregate(zoo(rnum, dts), as.yearmon, sum)
   Jan 2001    Feb 2001    Mar 2001    Apr 2001    May 2001    Jun 2001
 4.43610085  0.49842227  7.52139932  1.47917343 10.64459923 -1.22530586
   Jul 2001    Aug 2001    Sep 2001    Oct 2001    Nov 2001    Dec 2001
 8.19563685  1.57626974  1.28842871  2.50540074  0.71156951  0.54118342
   Jan 2002    Feb 2002    Mar 2002    Apr 2002    May 2002    Jun 2002
-0.41292840 -2.41301496  3.23783551  0.63914807 -1.46357402  2.91651492
   Jul 2002    Aug 2002    Sep 2002    Oct 2002    Nov 2002    Dec 2002
 2.17263290 -2.30981022 -9.60701788  1.16504368 -3.07038254  1.38281927
   Jan 2003    Feb 2003    Mar 2003    Apr 2003    May 2003    Jun 2003
 4.48761479  2.42455090 -0.03743888  1.11223001 -4.07988016 -1.15116293
   Jul 2003    Aug 2003    Sep 2003    Oct 2003    Nov 2003    Dec 2003
-7.15292576 -2.34231702 -0.48132751 11.74252191  2.51063034 -4.35801058
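
If you would rather stay in base R, one possible workaround is to re-impose the calendar order on the month column after aggregate() has run, or to use tapply(), which lays out its result by the levels of the factors it is given. The following is only a sketch under a few assumptions of mine: it builds the dates with as.Date() instead of chron so that it runs without extra packages, it takes the month labels from the built-in month.abb constant, and the names yr, mon and tab are made up for illustration.

```r
# Sketch: restore calendar order of months after aggregate().
# Assumption: base Date objects stand in for the chron dates of the original post.
set.seed(1)
dts  <- seq(as.Date("2001-01-01"), as.Date("2003-12-31"), by = "day")
rnum <- rnorm(length(dts))

# Month labels come from month.abb so they do not depend on the locale.
mon <- month.abb[as.integer(format(dts, "%m"))]
yr  <- format(dts, "%Y")

agg <- aggregate(rnum, list(year = yr, month = mon), sum)

# Whatever order aggregate() produced, rebuild the factor with the
# levels in calendar order and sort the rows accordingly.
agg$month <- factor(as.character(agg$month), levels = month.abb)
agg <- agg[order(agg$year, agg$month), ]

# Alternative: tapply() arranges its result by the factor levels supplied,
# giving a year-by-month matrix with columns Jan..Dec.
tab <- tapply(rnum, list(year = yr, month = factor(mon, levels = month.abb)), sum)
```

After this, levels(agg$month) and colnames(tab) both run from "Jan" to "Dec". With the chron objects from the original post, the same re-leveling step should apply to agg$month, since months() there yields the same abbreviated labels.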
On Dec 16, 2007 9:23 AM, tom soyer <tom.soyer at gmail.com> wrote:
> Hi,
>
> I am using aggregate() to add up groups of data according to year and month.
> It seems that the function aggregate() automatically sorts the levels of
> factors of the grouping elements, even if the order of the levels of factors
> is supplied. I am wondering if this is a bug, or if I missed something
> important. Below is an example that shows what I mean. Does anyone know if
> this is just the way the aggregate function works, or are there ways
> to force aggregate() to keep the order of levels of factors supplied by the
> grouping elements? Thanks!
>
> library(chron)
> dts=seq.dates("1/1/01","12/31/03")
> rnum=rnorm(1:length(dts))
> df=data.frame(date=dts,obs=rnum)
> agg=aggregate(df[,2],list(year=years(df[,1]),month=months(df[,1])),sum)
> levels(agg$month) # aggregate() automatically sorts the levels alphabetically.
>
> [1] "Apr" "Aug" "Dec" "Feb" "Jan" "Jul" "Jun" "Mar" "May" "Nov" "Oct" "Sep"
>
> fmonth=factor(months(df[,1]))
> levels(fmonth) # factor() automatically generates the levels in the correct order.
>
> [1] "Jan" "Feb" "Mar" "Apr" "May" "Jun" "Jul" "Aug" "Sep" "Oct" "Nov" "Dec"
>
>
> agg2=aggregate(df[,2],list(year=years(df[,1]),month=fmonth),sum)
> levels(agg2$month) # even if a factor with levels in the correct order is
> # supplied, aggregate() sorts the levels alphabetically regardless.
>
> [1] "Apr" "Aug" "Dec" "Feb" "Jan" "Jul" "Jun" "Mar" "May" "Nov" "Oct" "Sep"
>
>
> --
> Tom
>