[Rd] Why is there no c.factor?
Thomas Lumley
tlumley at u.washington.edu
Thu Feb 4 18:06:47 CET 2010
On Thu, 4 Feb 2010, Hadley Wickham wrote:
> Hi all,
>
> Is there a reason that there is no c.factor method? Analogous to
> c.Date, I'd expect something like the following to be useful:
>
> c.factor <- function(...) {
>   factors <- list(...)
>   levels <- unique(unlist(lapply(factors, levels)))
>   char <- unlist(lapply(factors, as.character))
>
>   factor(char, levels = levels)
> }
>
> c(factor("a"), factor("b"), factor(c("c", "b","a")), factor("d"))
> # [1] a b c b a d
> # Levels: a b c d
>
It's well established that different people have different views on what factors should do, but this doesn't match mine. I think of factors as enumerated data types where the factor levels already specify all the valid values for the factor, so I wouldn't want to be able to combine two factors with different sets of levels.
For example:
A <- factor("orange", levels = c("orange", "yellow", "red", "purple"))
B <- factor("orange", levels = c("orange", "apple", "mango", "banananana"))
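A rough sketch of what this stricter view might look like in code is given here. The helper combine_strict is hypothetical (it is not part of base R and not a proposal from this thread), and the factor C is added purely for illustration. It combines factors only when their level sets are identical and refuses otherwise:

```r
# Hypothetical helper (an assumption for illustration, not base R):
# combine factors only when every argument has identical levels.
combine_strict <- function(...) {
  factors <- list(...)
  lev <- levels(factors[[1]])
  same <- vapply(factors, function(f) identical(levels(f), lev), logical(1))
  if (!all(same)) stop("factors have different level sets")
  # Rebuild from the character values so the shared levels are kept.
  factor(unlist(lapply(factors, as.character)), levels = lev)
}

A <- factor("orange", levels = c("orange", "yellow", "red", "purple"))
B <- factor("orange", levels = c("orange", "apple", "mango", "banananana"))
C <- factor("red", levels = levels(A))  # illustrative, same levels as A

ok <- combine_strict(A, C)  # allowed: A and C share one level set

# A and B both contain "orange" but enumerate different value sets,
# so the strict helper signals an error instead of merging them.
res <- tryCatch(combine_strict(A, B), error = function(e) conditionMessage(e))
```

Under this reading, A and B are values of two different enumerated types that happen to share a label, so combining them is treated as a type error rather than as a reason to take the union of the levels.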
On the other hand, I think the current behaviour, which reduces them to numbers, is just wrong.
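To illustrate why reducing factors to numbers loses information, here is a small sketch. It assumes that c() discards the factor class and keeps only the underlying integer codes, as described above. Other R versions may behave differently, so the sketch combines the codes explicitly with as.integer() instead of calling c() on the factors. A2 and B2 are illustrative variants of the example factors, not taken from the original message:

```r
# Illustrative factors: different values, different level sets.
A2 <- factor("yellow", levels = c("orange", "yellow", "red", "purple"))
B2 <- factor("apple",  levels = c("orange", "apple", "mango", "banananana"))

# as.integer() on a factor returns its internal codes, i.e. the position
# of each value within that factor's own levels.
codes <- c(as.integer(A2), as.integer(B2))
codes

# "yellow" and "apple" are each the second level of their own factor,
# so both collapse to the same number once the levels are dropped.
same_code <- codes[1] == codes[2]
```

The combined result no longer distinguishes "yellow" from "apple", and nothing in it records which level set each code came from.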
-thomas
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle