[Rd] c.factor

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Nov 14 19:22:59 CET 2006


Well, R has managed without a factor method for c() for most of its decade 
of existence (not that it originally had factors as we know them).

I would argue that factors are best viewed as an enumeration type, and 
anything which silently changes their level set is a bad idea.  I can see 
a case for a c() method for factors that combines factors with the same 
level sets, but I can also see this is best done by users who know the 
level sets are the same (c.factor would have to expend considerable effort 
to check).
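
As a rough illustration of the user-side approach described above, here is a minimal sketch. The helper name combine_same_levels is hypothetical (it is not part of R), and the sketch assumes unordered factors whose levels must match exactly, including their order:

```r
# Hypothetical helper (not part of base R): combine two factors only when
# their level sets are identical, in the same order.
combine_same_levels <- function(x, y) {
  stopifnot(is.factor(x), is.factor(y))
  if (!identical(levels(x), levels(y)))
    stop("factors have different level sets")
  # Concatenate the underlying integer codes and map them back to labels,
  # keeping the shared level set unchanged.
  codes <- c(as.integer(x), as.integer(y))
  factor(levels(x)[codes], levels = levels(x))
}

x <- factor(c("a", "b"), levels = c("a", "b", "c"))
y <- factor(c("c", "a"), levels = c("a", "b", "c"))
z <- combine_same_levels(x, y)
```

Refusing mismatched level sets, rather than taking their union, is one way to keep the level set from changing silently; it does not preserve the "ordered" class.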

You also need to consider the dispatch rules.  c.factor will be called 
whenever the first argument is a factor, whatever the others are. S4 (I 
think, definitely S4-based versions of S-PLUS) has an alternative concat() 
that works differently (recursively) and seems a more natural model.
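
The dispatch point can be sketched with a stand-in S3 generic. The names combine, combine.default and combine.factor below are hypothetical, chosen so that the example does not depend on how c() itself behaves in any particular R version; the only assumption is standard S3 dispatch on the first argument:

```r
# Hypothetical S3 generic used only to illustrate dispatch on the first argument.
combine <- function(...) UseMethod("combine")
combine.default <- function(...) "default method"
combine.factor  <- function(...) "factor method"

f <- factor(c("a", "b"))

first_is_factor  <- combine(f, 1:3)   # factor method, although 1:3 is not a factor
second_is_factor <- combine(1:3, f)   # default method, although f is a factor
```

Only the class of the first argument decides which method runs, so a factor method would have to cope with arbitrary later arguments, and would never see factors that appear after a non-factor.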


On Tue, 14 Nov 2006, Marc Schwartz wrote:

> On Tue, 2006-11-14 at 11:51 -0600, Marc Schwartz wrote:
>> On Tue, 2006-11-14 at 16:36 +0000, Matthew Dowle wrote:
>>> Hi,
>>>
>>> Given factors x and y,  c(x,y) does not seem to return a useful result :
>>>> x
>>> [1] a b c d e
>>> Levels: a b c d e
>>>> y
>>> [1] d e f g h
>>> Levels: d e f g h
>>>> c(x,y)
>>>  [1] 1 2 3 4 5 1 2 3 4 5
>>>>
>>>
>>> Is there a case for a new method c.factor as follows?  Does something
>>> similar exist already?  Is there a better way to write the function?
>>>
>>>> c.factor = function(x,y)
>>> {
>>>     newlevels = union(levels(x),levels(y))
>>>     m = match(levels(y), newlevels)
>>>     ans = c(unclass(x),m[unclass(y)])
>>>     levels(ans) = newlevels
>>>     class(ans) = "factor"
>>>     ans
>>> }
>>>> c(x,y)
>>>  [1] a b c d e d e f g h
>>> Levels: a b c d e f g h
>>>> as.integer(c(x,y))
>>>  [1] 1 2 3 4 5 4 5 6 7 8
>>>>
>>>
>>> Regards,
>>> Matthew
>>
>> I'll defer to others as to whether or not there is a basis for c.factor,
>> however:
>>
>> c.factor <- function(...)
>> {
>>   args <- list(...)
>>
>>   # this could be optional
>>   if (!all(sapply(args, is.factor)))
>>    stop("All arguments must be factors")
>>
>>   factor(unlist(lapply(args, function(x) as.character(x))))
>> }
>
>
> That last line can even be cleaned up, as I was doing something else
> initially:
>
> c.factor <- function(...)
> {
>  args <- list(...)
>
>  if (!all(sapply(args, is.factor)))
>   stop("All arguments must be factors")
>
>  factor(unlist(lapply(args, as.character)))
> }
>
>
> Marc
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


