[R] expand.grid and the first level of a factor
Martin Maechler
maechler at stat.math.ethz.ch
Sat May 3 16:03:27 CEST 2003
>>>>> "UweL" == Uwe Ligges <ligges at statistik.uni-dortmund.de>
>>>>> on Sat, 03 May 2003 15:23:59 +0200 writes:
UweL> Giovanni Marchetti wrote:
>> I do not understand this behaviour of expand.grid:
>>
>>
>>> expand.grid(x = c("b", "a"), y = c(1, 2))$x
>>
>> [1] b a b a
>> Levels: b a
>>
>>> expand.grid(x = c("b", "a"))$x
>>
>> [1] b a
>> Levels: a b
>>
>> Why does the first level of the factor x depend on the number
>> of arguments of expand.grid? Apparently, I can set
>> the order of the levels only when the number of
>> arguments is > 1. In the second example, the order
>> is lexicographic.
>>
>> -- Giovanni
UweL> It depends on the number of arguments, because of the implementation
UweL> (look into the code):
UweL> In principle, expand.grid(x = c("b", "a")) does the following:
UweL> x <- c("b", "a")
UweL> factor(x)
UweL> whereas for expand.grid(x = c("b", "a"), y = c(1, 2)), the levels will
UweL> be specified as in:
UweL> factor(x, levels = unique(x))
UweL> Hence the difference.
which does not seem ideal to me.
factor() itself,
> str(factor)
function (x, levels = sort(unique.default(x), na.last = TRUE),
labels = levels, exclude = NA, ordered = is.ordered(x))
does sort the levels by default, and that's what happens in the
one argument case via data.frame().
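
To make the difference concrete, here is a small sketch that uses only
factor() on the vector from the example above. It puts the default (sorted)
levels next to levels taken in order of first appearance. The variable names
f_sorted and f_appear are chosen just for this illustration.

```r
x <- c("b", "a")

# Default: levels = sort(unique(x)), so the levels come out sorted
f_sorted <- factor(x)
levels(f_sorted)   # "a" "b"

# Explicit levels in order of first appearance,
# as in the multi-argument expand.grid() case
f_appear <- factor(x, levels = unique(x))
levels(f_appear)   # "b" "a"

# Same values, different level order, hence different integer codes
as.integer(f_sorted)   # 2 1
as.integer(f_appear)   # 1 2
```

Both factors print the same values; only the "Levels:" line differs, which is
exactly what the two expand.grid() calls in the original question show.
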
S-PLUS 6.1 does the same for factor(), but it doesn't sort the
levels of expand.grid() arguments in any case.
I'm just now testing a patch to our expand.grid() which no longer
treats the one-argument case specially, as it does now, and seems to cure
the whole "infelicity"...
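
As a rough illustration of what handling every argument the same way could
look like (this is not the patch itself, only a sketch under the assumption
that character arguments should always keep their order of appearance),
consider a hypothetical helper grid_factor() applied to each argument,
whether there is one or several:

```r
# grid_factor() is a hypothetical helper for illustration only;
# it is not part of R and not the actual patch.
grid_factor <- function(x) {
  # Convert character vectors to factors with levels in order of appearance;
  # leave everything else (numeric vectors, existing factors) untouched.
  if (is.character(x)) factor(x, levels = unique(x)) else x
}

# The same rule is applied in the one-argument and two-argument cases
one_arg  <- lapply(list(x = c("b", "a")), grid_factor)
two_args <- lapply(list(x = c("b", "a"), y = c(1, 2)), grid_factor)

levels(one_arg$x)    # "b" "a"
levels(two_args$x)   # "b" "a"
identical(levels(one_arg$x), levels(two_args$x))   # TRUE
```

With a rule like this, the first level of x no longer depends on how many
arguments are supplied.
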
I cannot imagine that anyone's code relies on the current
behavior as opposed to the more consistent one.
Martin