[Rd] Inconsistency in as.data.frame.table for stringsAsFactors
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Sat Jan 23 12:12:54 CET 2010
Stavros Macrakis wrote:
> Martin,
>
> I agree that global options settings that affect computations are
> problematic.
>
> But that's not the issue I was addressing. If for some classes, func.CLASS
> has certain defaults for some arguments, it is surprising that for other
> classes, it has different defaults, whether these defaults are fixed or
> taken from global settings -- when there is no obvious reason for the
> default to vary by class.
>
> -s
"A foolish consistency is the hobgoblin of little minds..."
The thing is that if you are converting the classifying factors of a
table to columns of a data frame, you will presumably prefer that they
come out as factors, retaining level order. The alternative is like this:
> (x <- as.table(c("Rare"=5, "Medium"=2, "Well-done"=6)))
     Rare    Medium Well-done
        5         2         6
> df <- as.data.frame(x, stringsAsFactors=FALSE)
> xtabs(Freq~Var1, data=df)
Var1
   Medium      Rare Well-done
        2         5         6
This is completely different from other cases, where as.data.frame will
auto-convert character variables to factors, e.g., on reading. Having a
global option intended for read.table() interfere with the above kind of
operation could be a really nasty surprise for the user. (Notice also
that the option was introduced in 2.10.0; before then, no one would
expect that classifying factors could come out as non-factors.
Defaulting to the global option could easily break working code.)
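
To make the contrast concrete, here is a minimal sketch in R that runs
both conversions side by side. The object names (df_fac, df_chr, tab_fac,
tab_chr) are only illustrative, and it assumes as.data.frame() on a table
still defaults to stringsAsFactors=TRUE, as described above.

```r
# Sketch: the default conversion vs. stringsAsFactors = FALSE.
x <- as.table(c(Rare = 5, Medium = 2, "Well-done" = 6))

df_fac <- as.data.frame(x)                            # Var1 is a factor
df_chr <- as.data.frame(x, stringsAsFactors = FALSE)  # Var1 is character

# The factor keeps the level order of the original table.
print(levels(df_fac$Var1))

# xtabs() has to turn the character column back into a factor, and
# the levels it builds are sorted rather than taken from the table.
tab_fac <- xtabs(Freq ~ Var1, data = df_fac)
tab_chr <- xtabs(Freq ~ Var1, data = df_chr)

print(dimnames(tab_fac)$Var1)  # order of the original table
print(dimnames(tab_chr)$Var1)  # sorted order
```

If the character version is really wanted, the original order can be
restored by hand with factor(df_chr$Var1, levels = names(x)), but that
is exactly the step the factor default spares the user.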
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907