[R] correction to the previously asked question (about merging factors)
Spencer Graves
spencer.graves at pdf.com
Fri Feb 6 00:44:57 CET 2004
Thanks, Peter.
So Sundar's more elegant solution is equivalent to my initial
response to this question -- which shows how much one can lose trying to
be too clever.
Best Wishes,
spencer graves
Peter Dalgaard wrote:
>Spencer Graves <spencer.graves at pdf.com> writes:
>
>
>
>> Sundar: Your solution is not only more elegant than mine, it's
>>also faster, at least with this tiny example:
>> > start.time <- proc.time()
>> > k1 <- length(F1)
>> > k2 <- length(F2)
>> > F12.lvls <- unique(c(levels(F1), levels(F2)))
>> > F. <- factor(rep(F12.lvls[1], k1+k2), levels=F12.lvls)
>> > F.[1:k1] <- F1
>> > F.[-(1:k1)] <- F2
>> > proc.time()-start.time
>>[1] 0.00 0.00 0.42 NA NA
>> >
>> > start.time <- proc.time()
>> > F1 <- factor(c("b", "a"))
>> > F2 <- factor(c("c", "b"))
>> > F3 <- factor(c(levels(F1)[F1], levels(F2)[F2]))
>> > proc.time()-start.time
>>[1] 0.00 0.00 0.24 NA NA
>> >
>> With longer vectors, mine may be faster -- but yours is still
>>more elegant.
>> Best Wishes,
>> spencer graves
>>
>>
>
>Actually, Sundar's solution is exactly equivalent to the
>
>factor(c(as.character(F1),as.character(F2)))
>
>that several have suggested, and which may actually be good enough for
>the vast majority of cases. It is in fact the same thing that goes on
>inside rbind.data.frame (which uses as.vector, which is equivalent).
>
>If you really want something optimal, in the sense of not allocating a
>large amount of character strings and comparing them individually to
>a joint level set, I think you need something like this:
>
>l1 <- levels(F1)
>l2 <- levels(F2)
>ll <- sort(unique(c(l1, l2)))
>m1 <- match(l1, ll)
>m2 <- match(l2, ll)
>factor(c(m1[F1], m2[F2]), labels=ll)
>
>or if you want to be really hardcore, bypass the inefficiencies inside
>factor() and do
>
>structure(c(m1[F1], m2[F2]), levels=ll, class="factor")
>
>(People have been known to regret coding with explicit calls to
>structure(), though...)
>
>
>
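For readers who want to check the equivalences discussed in the quoted
message, here is a minimal sketch in R using the same tiny F1 and F2 from
the thread. The names by_char, by_levels, by_codes and by_struct are only
illustrative. The explicit levels = seq_along(ll) argument is an addition
that is not in Peter's code; it is made on the assumption that one wants
the factor() call to keep working when a factor carries unused levels.

```r
# The two small factors used in the thread
F1 <- factor(c("b", "a"))
F2 <- factor(c("c", "b"))

# Go through character vectors (the form several posters suggested)
by_char <- factor(c(as.character(F1), as.character(F2)))

# Index the levels by the factor's integer codes (Sundar's form)
by_levels <- factor(c(levels(F1)[F1], levels(F2)[F2]))

# Remap the integer codes onto a joint, sorted level set
l1 <- levels(F1)
l2 <- levels(F2)
ll <- sort(unique(c(l1, l2)))
m1 <- match(l1, ll)   # position of each F1 level in the joint set
m2 <- match(l2, ll)   # position of each F2 level in the joint set

# Assumption: levels = seq_along(ll) is added here so that labels always
# line up with the joint level set, even if some level never occurs
by_codes <- factor(c(m1[F1], m2[F2]), levels = seq_along(ll), labels = ll)

# The structure() variant, which skips factor() entirely
by_struct <- structure(c(m1[F1], m2[F2]), levels = ll, class = "factor")

# All four should describe the same merged factor
stopifnot(identical(by_char, by_levels),
          identical(by_char, by_codes),
          identical(by_char, by_struct))
```

On this example the four results should be the same factor with levels
a, b, c. The code-remapping variants only touch the (short) level vectors
as strings, which is the saving Peter describes for long factors.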