[Rd] apply: new behaviour for factors in R-2.4.0
Christoph Buser
buser at stat.math.ethz.ch
Thu Sep 28 16:32:45 CEST 2006
Dear Brian
Thank you for your answer and the comment you included on the
apply() help page.
1)
You are correct: my data.frame is coerced into a matrix in
apply().
2)
I agree that the new version of unlist() is better and works
correctly and that in array() (due to as.vector()) the factor
"ans" is coerced into a character matrix.
Nevertheless, I disagree that this is a matter of "feature freeze"
relative to R version 2.3.1:
Since in R-2.3.1, unlist() on a list of factors returned an
integer vector, the result of apply was an integer matrix and
not a character matrix.
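A minimal sketch of the mechanism described above, assuming a
current R version in which unlist() on a list of factors returns a
factor. The object names (f_list, u, m) are only illustrative:

```r
# A list of factors, as apply() collects internally from as.factor()
f_list <- list(factor(c("A", "B")), factor(c("B", "A")))

# unlist() now combines them into a single factor
u <- unlist(f_list)
is.factor(u)

# array() calls as.vector() on its data, which turns a factor into
# its character labels, so the matrix ends up character
m <- array(u, dim = c(2, 2))
is.character(m)

# The example from the original report
dat <- data.frame(x = 1:10, f1 = gl(2, 5, labels = c("A", "B")))
d1 <- apply(dat, 2, as.factor)
is.character(d1)
```
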
Therefore my question is whether it would be desirable to return an
integer matrix by changing apply(). One could include additional
code to handle the case where the output "should" be a factor
matrix and coerce it into an integer matrix.
Then the outcome would be consistent with R-2.3.1 without
changing anything in unlist() or array().
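As a rough sketch of what an integer result could look like without
touching apply(), one can convert the columns explicitly. This is
not a proposed patch, only an illustration; the names codes and dm
are hypothetical, and x = 11:20 is used as suggested in the reply
quoted below:

```r
dat <- data.frame(x = 11:20, f1 = gl(2, 5, labels = c("A", "B")))

# Integer codes of each column treated as a factor:
# x becomes 1..10 (its level codes), not 11..20
codes <- sapply(dat, function(col) as.integer(as.factor(col)))

# data.matrix() keeps numeric columns as they are and replaces
# factors by their integer codes
dm <- data.matrix(dat)

codes[1:3, ]
dm[1:3, ]
```

The two results agree for f1 but differ for x, so an integer matrix
of factor codes matches data.matrix() only in special cases.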
But in the end I am not sure whether an integer matrix is better than
a character matrix or a factor matrix. I am not sure what output
is best if one uses as.factor in apply().
Regards,
Christoph
--------------------------------------------------------------
Christoph Buser <buser at stat.math.ethz.ch>
Seminar fuer Statistik, LEO C13
ETH Zurich 8092 Zurich SWITZERLAND
phone: x-41-44-632-4673 fax: 632-1228
http://stat.ethz.ch/~buser/
--------------------------------------------------------------
Prof Brian Ripley writes:
> Christoph,
>
> This is more complicated than your analysis.
>
> 1) apply takes a matrix as an argument, not a data frame, and so first
> coerced 'dat' to a character matrix.
>
> 2) unlist is working quite correctly. The issue is array(), which
> contains as.vector(data). Thus although the result could be a factor
> matrix, as.vector is coercing it to a character matrix. It might be
> desirable to return a factor matrix, but we are not going to do that in
> feature freeze (if ever) and I really don't think it would be what you
> wanted.
>
> Perhaps the help page should contain an explicit statement that the result
> will be coerced to a basic vector type by as.vector().
>
> On Mon, 25 Sep 2006, Christoph Buser wrote:
>
> > Dear R-core
> >
> > There is a different output for the apply function due to the
> > change of unlist as mentioned in the R news.
> >
> > Applying as.factor() (or factor()) in
> >
> > str(dat <- data.frame(x = 1:10, f1 = gl(2,5,labels = c("A", "B"))))
> > (d1 <- apply(dat,2,as.factor))
> >
> > now returns a character matrix while in R-2.3.1 the same
> > command resulted in an integer matrix that was consistent (up to
> > the ordering of the factor levels) with data.matrix().
>
> That's coincidence -- try x=11:20.
>
> > The change is caused by the change of unlist() which, applied to a
> > list of factors, now returns a single factor instead of an
> > integer vector. I am happy with this change, but:
> >
> > Is it desirable to change apply so that it does not return a
> > character matrix in the example above or include a warning for
> > such a case?
> >
> > Thank you very much for an answer.
> >
> > Regards,
> >
> > Christoph Buser
> >
> > --------------------------------------------------------------
> > Christoph Buser <buser at stat.math.ethz.ch>
> > Seminar fuer Statistik, LEO C13
> > ETH Zurich 8092 Zurich SWITZERLAND
> > phone: x-41-44-632-4673 fax: 632-1228
> > http://stat.ethz.ch/~buser/
> >
> > ______________________________________________
> > R-devel at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> --
> Brian D. Ripley, ripley at stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595