[R] R design (was "Variable passed to function not used in function in select)

Berwin A Turlach berwin at maths.uwa.edu.au
Wed Nov 12 05:04:40 CET 2008

G'day Peter,

On Wed, 12 Nov 2008 00:42:36 +0100
Peter Dalgaard <p.dalgaard at biostat.ku.dk> wrote:

> > On 12/11/2008, at 11:29 AM, Peter Dalgaard wrote:
> >>
> >> Not that one again! For at least one other value of one, the
> >> expectation is the opposite: Factor levels do not go away just
> >> because they happen not to be present in data.
> >>
> >> fct <- sapply(dd, is.factor)
> >> dd[fct] <- lapply(dd[fct], "[", drop=TRUE)
> >>
> (Actually, the last line could have had lapply(dd[fct],factor), I
> just got confused about whether it would preserve the level order.)

That was my first thought too, that lapply(dd[fct], factor) should be
enough.  But then I thought that ordered factors test TRUE for
is.factor and that using factor() on an ordered factor will not only
remove the factor levels that are missing but also remove the
information on the ordering of the remaining levels.  A short test
showed that the ordering is in fact preserved; and after reading the
help page of factor() I realised that this is due to the design of the
function.
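
For the record, a minimal sketch of such a test.  The data and the
names x, y, z and dd are made up for illustration; the point is only
that factor() defaults its `ordered' argument to is.ordered(x):

```r
# Ordered factor whose level order differs from alphabetical order.
x <- factor(c("small", "medium", "large", "small"),
            levels = c("small", "medium", "large"), ordered = TRUE)

# Subsetting keeps the now unused level "large".
y <- x[x != "large"]
levels(y)

# Re-factoring drops the unused level but keeps the ordering,
# because the default is ordered = is.ordered(x).
z <- factor(y)
levels(z)
is.ordered(z)

# The same idea applied to every factor column of a data frame,
# using a logical vector (not a list) as the column index.
dd <- data.frame(f = y, n = seq_along(y))
fct <- vapply(dd, is.factor, logical(1))
dd[fct] <- lapply(dd[fct], factor)
levels(dd$f)
is.ordered(dd$f)
```

Here "small" still sorts before "medium" after the unused level is
dropped, so the variable stays ordinal.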

So perhaps this example should be documented as a case in which the
design decisions of the R developers save a naive user from
accidentally doing the wrong thing (namely turning an ordinal variable
into a nominal one). :)


