# [R] Problems with merge

Prof Brian D Ripley ripley at stats.ox.ac.uk
Sat Mar 16 08:33:25 CET 2002

On Sat, 16 Mar 2002 ggrothendieck at yifan.net wrote:

> > # this works ok
> > data(iris)
> > merge(iris[1,],iris)
>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
> 1          5.1         3.5          1.4         0.2  setosa
> >
> > # but try the same thing with this data frame
> > d.df <- data.frame(x=1:3,y=c("A","D","E"),z=c(6,9,10))
> > d.df
>   x y  z
> 1 1 A  6
> 2 2 D  9
> 3 3 E 10
> >
> > # this results in zero rows whereas it should have 1 !!!
> > merge(d.df[1,],d.df)
> [1] x y z
> <0 rows> (or 0-length row.names)
> >
> > # remove last case and it suddenly works OK
> > merge(d.df[1,],d.df[-3,])
>   x y z
> 1 1 A 6
> >
> > # or remove z and it suddenly works OK
> > merge(d.df[1,-3],d.df[,-3])
>   x y
> 1 1 A

This is a bug.  Look at

> d.df[1,]
  x y z
1 1 A 6

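to see why: in the printout of d.df the last column is padded with leading
spaces (" 6", not "6"), so presumably its entries no longer match the
unpadded value from the single row when compared as strings.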
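
The padding effect can be reproduced in isolation. The following is only a sketch of the presumed mechanism (numeric keys turned into formatted strings before matching); it does not reproduce merge's internals, and the names `whole` and `single` are made up for illustration:

```r
z <- c(6, 9, 10)

# Formatting the whole column pads every entry to a common width
whole <- format(z)       # " 6" " 9" "10"

# Formatting one value on its own needs no padding
single <- format(z[1])   # "6"

# As strings the two no longer match ...
match(single, whole)     # NA

# ... although the underlying numbers do
match(z[1], z)           # 1
```

This is also consistent with the two cases that work in the quoted examples: with the last row dropped, or with z removed, every remaining value has the same printed width.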

> > # Here is another different problem with merge
> > # we specified no sorting but the rows are not in the same order as e.df

Do read the help page. This is sorted in the order specified there:

The rows are by default lexicographically sorted on
the common columns, but are otherwise in the order in which they
occurred in `x'.

> > e.df <- data.frame(x=c(1,4,5,1,3,5),y=c("A","D","E","A","C","E"),z=c(6,9,10,6,8,10))
> > e.df
>   x y  z
> 1 1 A  6
> 2 4 D  9
> 3 5 E 10
> 4 1 A  6
> 5 3 C  8
> 6 5 E 10
> > merge(e.df,unique(e.df),sort=F)
>   x y  z
> 1 1 A  6
> 2 1 A  6
> 3 4 D  9
> 4 5 E 10
> 5 5 E 10
> 6 3 C  8
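
If the aim is to get the merged rows back in the order of e.df, one possible workaround (not suggested in the thread itself) is to carry an explicit row index through the merge and reorder on it afterwards. A minimal sketch, assuming a version of merge without the matching bug discussed above; the column name `ord` and the objects `u.df` and `m` are hypothetical helpers:

```r
e.df <- data.frame(x = c(1, 4, 5, 1, 3, 5),
                   y = c("A", "D", "E", "A", "C", "E"),
                   z = c(6, 9, 10, 6, 8, 10))

# Distinct rows, taken before the index column is added
u.df <- unique(e.df)

# Hypothetical helper column recording the original row position
e.df$ord <- seq_len(nrow(e.df))

# The join uses the common columns x, y and z; ord is carried along
m <- merge(e.df, u.df, sort = FALSE)

# Restore the order of e.df, then drop the helper column
m <- m[order(m$ord), ]
m$ord <- NULL
m
```

Because every row of e.df matches exactly one row of u.df, the index stays unique and the reordering is unambiguous.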

