[R] bug in interaction order when using drop?

Petr Pikal petr.pikal at precheza.cz
Fri Aug 11 14:07:44 CEST 2006


On 11 Aug 2006 at 12:31, Prof Brian Ripley wrote:

Date sent:      	Fri, 11 Aug 2006 12:31:55 +0100 (BST)
From:           	Prof Brian Ripley <ripley at stats.ox.ac.uk>
To:             	Petr Pikal <petr.pikal at precheza.cz>
Copies to:      	r-help at stat.math.ethz.ch
Subject:        	Re: [R] bug in interaction order when using drop?

> On Thu, 10 Aug 2006, Petr Pikal wrote:
> 
> > Oops, my first suggestion reorders the factor itself, but
> > 
> > if (drop) factor(ans) else ans
> > 
> > in place of the whole drop construction should preserve the order
> > of the levels without changing the order of the factor.
> 
> Even easier would be to return ans[,drop=drop].  It seems to me that
> there is an argument for expecting interaction(..., drop=TRUE) to give
> the same result as interaction(...)[,drop=TRUE], but little argument
> that any ordering is a *bug*.

Maybe bug was an *exaggeration*, but what surprised me was that 
interaction gives a different level order with and without drop. I 
would call the behaviour inconsistent, since omitting unused levels 
silently changes the order of the factor levels.

> set.seed(1)
> DF <- data.frame(x = sample(LETTERS[1:3], 10, replace = TRUE),
+                  y = sample(letters[1:3], 10, replace = TRUE))

> interaction(DF$x,DF$y)
 [1] A.a B.a B.c C.b A.c C.b C.c B.c B.b A.c
Levels: A.a B.a C.a A.b B.b C.b A.c B.c C.c

Here the ordering is neat, although, as you said, the levels of the 
first factor vary fastest.

> interaction(DF$x,DF$y, drop=T)
 [1] A.a B.a B.c C.b A.c C.b C.c B.c B.b A.c
Levels: A.a B.a B.c C.b A.c C.c B.b

This seems chaotic to me, but I would be glad if you could explain 
some rational pattern in it.
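
One pattern that does fit the output above: the levels seem to be 
listed in the order in which each combination first occurs in the 
data. A small sketch to check this against the printed values. The 
vectors are typed in from the output rather than regenerated with 
sample(), and the names vals, reported and first_seen are only for 
illustration.

```r
# Values copied from the printed result of interaction(DF$x, DF$y, drop=T)
vals <- c("A.a", "B.a", "B.c", "C.b", "A.c",
          "C.b", "C.c", "B.c", "B.b", "A.c")

# Levels as printed for the drop=T call
reported <- c("A.a", "B.a", "B.c", "C.b", "A.c", "C.c", "B.b")

# unique() keeps values in order of first appearance
first_seen <- unique(vals)
print(first_seen)

# Stops with an error if the two orderings differ
stopifnot(identical(first_seen, reported))
```

If that reading is right, the order depends on how the rows happen to 
be arranged, not on the levels of x and y.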

> my.int(DF$x,DF$y,drop=T) # changed as suggested
 [1] A.a B.a B.c C.b A.c C.b C.c B.c B.b A.c
Levels: A.a B.a B.b C.b A.c B.c C.c
>

Same ordering as without drop, with unused levels omitted.
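
The effect of the suggested change can be sketched without my.int 
itself (its body is not shown here): build the full interaction, 
then drop the unused levels with factor() or with [, drop=TRUE]. 
The vectors x and y are typed in from the output above, and the 
variable names are only for illustration.

```r
# x and y reconstructed from the printed interaction values above
x <- c("A", "B", "B", "C", "A", "C", "C", "B", "B", "A")
y <- c("a", "a", "c", "b", "c", "b", "c", "c", "b", "c")

full <- interaction(x, y)        # keeps all 9 level combinations

# Two ways to drop unused levels while keeping the level order
by_factor <- factor(full)
by_subset <- full[, drop = TRUE]

# Both approaches should give the same levels
stopifnot(identical(levels(by_factor), levels(by_subset)))

# The values of the factor itself are not reordered
stopifnot(identical(as.character(by_factor), as.character(full)))

print(levels(by_factor))
```

The printed levels should match those shown for my.int above, that 
is, the full ordering with C.a and A.b left out.
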
Best regards.
Petr Pikal

> 
> The order of the levels of a factor is arbitrary, and in fact they
> seem to me to be in a strange order, with the levels of the first
> factor varying fastest (reverse lexicographic order).
> 
> > levels(interaction(c("A", "A", "B"), letters[1:3]))
> [1] "A.a" "B.a" "A.b" "B.b" "A.c" "B.c"
> 
> so the existing
> 
> > levels(interaction(c("A", "A", "B"), letters[1:3], drop=T))
> [1] "A.a" "A.b" "B.c"
> 
> looks more sensible in this case.
> 
> > 
> > Petr
> > 
> > On 10 Aug 2006 at 16:32, Petr Pikal wrote:
> > 
> > From:           	"Petr Pikal" <petr.pikal at precheza.cz>
> > To:             	r-help at stat.math.ethz.ch
> > Date sent:      	Thu, 10 Aug 2006 16:32:54 +0200
> > Priority:       	normal
> > Subject:        	[R] bug in interaction order when using drop?

<snip>

> > 
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html and provide commented,
> > minimal, self-contained, reproducible code.
> > 
> 
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Petr Pikal
petr.pikal at precheza.cz


