[Rd] (PR#9666) 'aggregate' should preserve level ordering of
ripley at stats.ox.ac.uk
Mon May 14 11:04:48 CEST 2007
On Tue, 8 May 2007, prechelt at inf.fu-berlin.de wrote:
> Full_Name: Lutz Prechelt
> Version: 2.4.1
> OS: Windows XP
> Submission from: (NULL) (160.45.111.67)
>
>
> aggregate (from package stats) should preserve the
> ordering of levels of factors it works on and also their
> 'ordered' attribute if present.
> But it does not.
In fact it treats all grouping variables consistently: it reduces them to
their levels, and data.frame then applies as.factor to the resulting column.
It is not at all clear that this is desirable. Take the example on the help
page: 'Cold' is reported as a factor even though it is logical. It seems
better not to coerce any of the grouping factors when putting them into the
data frame, but rather to index the original variable, and I have
implemented that for R-devel. As a side effect, your example works as you
would like. This does mean that grouping variables that are not factors
and cannot be inserted into a data frame will no longer work.
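As a rough check of the behaviour described above, the following sketch reruns the example from the report together with the help-page example. It assumes an R version that already includes the R-devel change; the names agg2 and the explicit 'Frost > 130' grouping are taken from the ?aggregate example as I understand it, and are illustrative rather than definitive.

```r
# Sketch only: assumes an R version that includes the change described above.
ff <- factor(c("a", "b", "a", "b"), levels = c("b", "a"), ordered = TRUE)
agg <- aggregate(1:4, list(groups = ff), sum)

levels(agg$groups)      # level order carried over from the original factor
is.ordered(agg$groups)  # whether the 'ordered' attribute survived

# Help-page style example: 'Cold' is built from a logical vector,
# so its class shows whether the grouping variable was coerced.
agg2 <- aggregate(state.x77,
                  list(Region = state.region,
                       Cold = state.x77[, "Frost"] > 130),
                  mean)
class(agg2$Cold)
```

On a version without the change, the same calls should reproduce the output quoted in the report below.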
> Here is an example:
>
> ff = factor(c("a","b","a","b"),levels=c("b","a"),ordered=T)
> agg = aggregate(1:4, list(groups=ff), sum)
> print(levels(agg$groups)) # should be: "b" "a"
> [1] "a" "b"
> print(is.ordered(agg$groups)) # should be: TRUE
> [1] FALSE
>
> -----
>
> ?aggregate ignores the issue completely:
> - the terms 'order' or 'level' do not occur in the
> text at all
> - the term 'factor' is mentioned only once:
> "The elements of the list will be coerced to
> factors (if they are not already factors)."
>
> -----
>
> This issue made me write the following code used
> for preparing the data for a barchart:
>
> df.a = aggregate(df[,value.var],
> list(grouping=dfgrouping, other=dfsubbar.var),
> FUN=FUN)
> if (is.factor(dfsubbar.var)) { # R 2.4: this should be done by 'aggregate'
> df.a$other = factor(df.a$other,
> levels=levels(dfsubbar.var),
> ordered=is.ordered(dfsubbar.var))
> }
>
> Cumbersome.
>
> R is great anyway. Thanks for your service building it!
>
> Lutz Prechelt
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595