[Rd] Why is there no c.factor?
Matthew Dowle
mdowle at mdowle.plus.com
Thu Feb 4 19:42:33 CET 2010
A search for "c.factor" returns tons of hits on this topic.
Here's just one of the hits from 2006, when I asked the same question:
http://tolstoy.newcastle.edu.au/R/e2/devel/06/11/1137.html
So it appears to be complicated, and there are good reasons why base R
has no c.factor.
Since I needed it, I created c.factor in the data.table package, below. It
is more efficient since it doesn't convert each factor to character (which
would lose some of the benefit of using factors). I've been told I'm not
unique in this approach and that other packages also have their own
c.factor. It deliberately isn't exported. It's worked well for me over the
years anyway.
c.factor = function(...)
{
    args <- list(...)
    # Coerce any non-factor argument to factor.
    for (i in seq(along=args)) if (!is.factor(args[[i]])) args[[i]] = as.factor(args[[i]])
    # The first must be factor otherwise we wouldn't be inside c.factor, it's
    # checked anyway in the line above.
    # Sorted union of all levels.
    newlevels = sort(unique(unlist(lapply(args,levels))))
    # Remap each factor's integer codes onto positions in newlevels.
    ans = unlist(lapply(args, function(x) {
        m = match(levels(x), newlevels)
        m[as.integer(x)]
    }))
    levels(ans) = newlevels
    class(ans) = "factor"
    ans
}
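
To make the remapping step concrete, here is a minimal sketch with made-up
inputs. The names f1, f2 and remap are hypothetical, chosen only for this
illustration, and are not part of data.table. It follows the same idea as
the function above: work on the integer codes and never build the character
vector.

```r
# Hypothetical inputs, for illustration only.
f1 <- factor(c("b", "a"))   # levels: a b, integer codes: 2 1
f2 <- factor(c("c", "b"))   # levels: b c, integer codes: 2 1

# Sorted union of the levels, as in c.factor above.
newlevels <- sort(unique(c(levels(f1), levels(f2))))

# Map each old level position to its position in newlevels, then index
# that map by the factor's integer codes.
remap <- function(x) match(levels(x), newlevels)[as.integer(x)]
codes <- c(remap(f1), remap(f2))

# Rebuild a factor directly from the integer codes.
ans <- structure(codes, levels = newlevels, class = "factor")
print(ans)
```

The match() call is done once per level rather than once per element, which
is where the saving comes from when the factors are long and have few
levels.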
"Hadley Wickham" <hadley at rice.edu> wrote in message
news:f8e6ff051002040753x33282f33l78fce9f98dc29ae8 at mail.gmail.com...
> Hi all,
>
> Is there a reason that there is no c.factor method? Analogous to
> c.Date, I'd expect something like the following to be useful:
>
> c.factor <- function(...) {
>   factors <- list(...)
>   levels <- unique(unlist(lapply(factors, levels)))
>   char <- unlist(lapply(factors, as.character))
>
>   factor(char, levels = levels)
> }
>
> c(factor("a"), factor("b"), factor(c("c", "b","a")), factor("d"))
> # [1] a b c b a d
> # Levels: a b c d
>
> Hadley
>
> --
> http://had.co.nz/
>
More information about the R-devel
mailing list