[R] Sort problem in merge()

Mon Mar 6 22:25:03 CET 2006

Sorry, I mixed up out and outa in my last post.  Here is the corrected version.

> levs <- c(LETTERS[1:6], "0")
> tmp1a <- data.frame(col1 = factor(c("A", "A", "C", "C", "0", "0"), levs))
> tmp2a <- data.frame(col1 = factor(c("C", "D", "E", "F"), levs), col2 = 1:4)
>
> out <- merge( cbind(tmp1a, seq = 1:nrow(tmp1a)), tmp2a, all.x = TRUE)
> out <- out[order(out$seq), -2]
> rownames(out) <- rownames(tmp1a)
> out
  col1 col2
1    A   NA
2    A   NA
3    C    1
4    C    1
5    0   NA
6    0   NA
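
The same idea can be wrapped in a small function.  This is only a minimal
sketch: merge_keep_order and the helper column .seq are hypothetical names,
not part of base R.  It uses order() on the helper column so that the result
does not depend on how merge() happens to sort the key, and it assumes the
keys in the second data frame are unique.

```r
# Hypothetical helper, not part of base R: left-join y onto x and
# return the rows in the original order of x.
merge_keep_order <- function(x, y, ...) {
  x$.seq <- seq_len(nrow(x))                   # remember original row positions
  out <- merge(x, y, all.x = TRUE, ...)        # merge() may reorder rows by key
  out <- out[order(out$.seq), , drop = FALSE]  # restore the original order
  out$.seq <- NULL                             # drop the helper column
  rownames(out) <- rownames(x)                 # assumes keys in y are unique
  out
}

levs <- c(LETTERS[1:6], "0")
tmp1a <- data.frame(col1 = factor(c("A", "A", "C", "C", "0", "0"), levs))
tmp2a <- data.frame(col1 = factor(c("C", "D", "E", "F"), levs), col2 = 1:4)
res <- merge_keep_order(tmp1a, tmp2a)
```

With the data above, res should hold the same six rows as the output shown
above.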

On 3/6/06, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> On 3/6/06, Gregor Gorjanc <gregor.gorjanc at gmail.com> wrote:
> >
> > But I want to get out
> >
> > A NA
> > A NA
> > C 1
> > C 1
> > 0 NA
> > 0 NA
> >
>
> That's what I get except for the rownames.  Be sure to
> make the factor levels consistent.  I have renamed the data frames
> tmp1a and tmp2a to distinguish them from the ones in your
> post and have also reset the rownames to be the original
> ones, as requested, so that the following is self-contained
> and should be reproducible:
>
> > levs <- c(LETTERS[1:6], "0")
> > tmp1a <- data.frame(col1 = factor(c("A", "A", "C", "C", "0", "0"), levs))
> > tmp2a <- data.frame(col1 = factor(c("C", "D", "E", "F"), levs), col2 = 1:4)
> >
> > outa <- merge( cbind(tmp1a, seq = 1:nrow(tmp1a)), tmp2a, all.x = TRUE)
> > outa <- outa[out$seq, -2]
> > rownames(outa) <- rownames(tmp1a)
> > outa
>   col1 col2
> 1    0   NA
> 2    0   NA
> 3    A   NA
> 4    A   NA
> 5    C    1
> 6    C    1
> >
> > R.version.string # Windows XP
> [1] "R version 2.2.1, 2005-12-20"
>
> By the way, the main limitation of this approach is that the elements of
> tmp2$col1 must be unique, so that the rows of the result correspond
> one-to-one to those of tmp1; that seems to be the case here.
>
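
The uniqueness requirement can be checked up front.  The following is only a
sketch, using the tmp1a and tmp2a data from above: anyDuplicated() guards
against repeated keys, and match() is swapped in for merge() as an alternative
lookup that never reorders rows (idx and out2 are names made up for this
example).

```r
levs <- c(LETTERS[1:6], "0")
tmp1a <- data.frame(col1 = factor(c("A", "A", "C", "C", "0", "0"), levs))
tmp2a <- data.frame(col1 = factor(c("C", "D", "E", "F"), levs), col2 = 1:4)

# Stop early if the lookup table has more than one row for any key.
stopifnot(anyDuplicated(tmp2a$col1) == 0)

# For each row of tmp1a, find the matching row of tmp2a (NA if none).
idx <- match(tmp1a$col1, tmp2a$col1)

# Build the result in the original row order of tmp1a.
out2 <- data.frame(col1 = tmp1a$col1, col2 = tmp2a$col2[idx])
```

Because match() returns only the first match for each key, a duplicated key
would otherwise be dropped silently, which is why the check comes first.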