# [R] expand.grid and the first level of a factor

Martin Maechler maechler at stat.math.ethz.ch
Sat May 3 16:03:27 CEST 2003

```
>>>>> "UweL" == Uwe Ligges <ligges at statistik.uni-dortmund.de>
>>>>>     on Sat, 03 May 2003 15:23:59 +0200 writes:

UweL> Giovanni Marchetti wrote:
>> I do not understand this behaviour of expand.grid:
>>
>>
>>> expand.grid(x = c("b", "a"), y = c(1, 2))$x
>>
>> [1] b a b a
>> Levels: b a
>>
>>> expand.grid(x = c("b", "a"))$x
>>
>> [1] b a
>> Levels: a b
>>
>> Why does the first level of the factor x depend on the number
>> of arguments of expand.grid? Apparently, I can set
>> the order of the levels only when the number of
>> arguments is > 1. In the second example, the order
>> is lexicographic.
>>
>> -- Giovanni

UweL> It depends on the number of arguments, because of the implementation
UweL> (look into the code):

UweL> In principle, expand.grid(x = c("b", "a")) does the following:

UweL> x <- c("b", "a")
UweL> factor(x)

UweL> whereas for expand.grid(x = c("b", "a"), y = c(1, 2)), the levels will
UweL> be specified as in:

UweL>    factor(x, levels = unique(x))

UweL> Hence the difference.

which seems not perfect to me.
factor() itself,
> str(factor)
function (x, levels = sort(unique.default(x), na.last = TRUE),
labels = levels, exclude = NA, ordered = is.ordered(x))

does sort the levels by default, and that's what happens in the
one argument case via data.frame().

S-plus 6.1 does the same for factor() but it doesn't sort the
levels of expand.grid() arguments in any case.

I'm just now testing a patch to our expand.grid() which doesn't
treat the one argument case specially as now and seems to cure
the whole "infelicity"...
I cannot imagine that anyone's code relies on the current
behavior as opposed to the more consistent one.

Martin

```
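The difference Uwe describes can be reproduced with factor() alone, without relying on how any particular version of expand.grid() is implemented. The following is a minimal sketch; the variable names `f_sorted`, `f_appearance` and `grid` are made up for illustration and do not come from the thread.

```r
x <- c("b", "a")

# Default: levels = sort(unique(x)), so the levels come out sorted.
f_sorted <- factor(x)
levels(f_sorted)
# [1] "a" "b"

# Explicit levels = unique(x): levels keep the order of first appearance.
f_appearance <- factor(x, levels = unique(x))
levels(f_appearance)
# [1] "b" "a"

# Assumption: a factor passed to expand.grid() keeps its own levels, so
# building the factor explicitly should give the same level order
# regardless of how many arguments are supplied.
grid <- expand.grid(x = factor(x, levels = unique(x)))
levels(grid$x)
```

Passing an explicitly constructed factor is probably the safest way to avoid depending on the number of arguments, since the level order is then fixed before expand.grid() sees the data.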