[R] Basic question on concatenating factors
jim holtman
jholtman at gmail.com
Sun Nov 23 06:08:23 CET 2008
You are right. union uses 'unique(c(x,y))', and I am not sure whether
'unique' is guaranteed to preserve the order, but the help page says that
"an element is omitted if it is identical to any previous element";
this suggests that the order of first occurrence is preserved.
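
A quick check of that reading, as a small sketch with made-up vectors (x and y
here are illustrative and are not the factors from the thread). It only shows
what current R does; as noted in the quoted message, the help page for union
does not appear to promise an ordering.

```r
# Illustrative character vectors with overlapping, unsorted values
x <- c("b", "a", "c")
y <- c("c", "d", "a")

# unique() keeps the first occurrence of each value, in order of appearance
u <- unique(c(x, y))
print(u)   # "b" "a" "c" "d"

# In current R, union() gives the same result for these inputs,
# but this is observed behaviour rather than a documented guarantee
print(identical(union(x, y), u))
```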
On Sat, Nov 22, 2008 at 11:43 PM, Stavros Macrakis
<macrakis at alum.mit.edu> wrote:
> On Sat, Nov 22, 2008 at 10:20 AM, jim holtman <jholtman at gmail.com> wrote:
>> c.Factor <-
>> function (x, y)
>> {
>> newlevels = union(levels(x), levels(y))
>> m = match(levels(y), newlevels)
>> ans = c(unclass(x), m[unclass(y)])
>> levels(ans) = newlevels
>> class(ans) = "factor"
>> ans
>> }
>
> This algorithm depends crucially on union preserving the order of the
> elements of its arguments. As far as I can tell, the spec of union
> does not require this. If union were to (for example) sort its
> arguments then merge them (generally a more efficient algorithm), this
> function would no longer work.
>
> Fortunately, the fix is simple. Instead of union, use:
>
> newlevels <- c(levels(x), setdiff(levels(y), levels(x)))
>
> which is guaranteed to preserve the order of levels(x).
>
> -s
>
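
For reference, here is a minimal sketch of the function with the setdiff
version of newlevels. The name concat_factors and the sample factors are
hypothetical, and the result is built with factor() rather than by setting
the class directly, which is a choice made for this sketch and not something
proposed in the thread.

```r
# Hypothetical helper: concatenate two factors, keeping levels(x) first
# and appending any levels of y that x does not already have.
concat_factors <- function(x, y) {
  stopifnot(is.factor(x), is.factor(y))
  # levels(x) in their original order, then the new levels from y
  newlevels <- c(levels(x), setdiff(levels(y), levels(x)))
  # position of each level of y within the combined level set
  m <- match(levels(y), newlevels)
  # integer codes of x are already valid; remap the codes of y through m
  codes <- c(as.integer(x), m[as.integer(y)])
  factor(newlevels[codes], levels = newlevels)
}

x <- factor(c("a", "b", "a"))
y <- factor(c("c", "b"), levels = c("c", "b"))
z <- concat_factors(x, y)
print(levels(z))        # "a" "b" "c"
print(as.character(z))  # "a" "b" "a" "c" "b"
```

Because levels(x) always comes first in newlevels, the codes of x need no
remapping; only the codes of y go through match().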
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?