[R] merging

Marc Schwartz (via MN) mschwartz at mn.rr.com
Tue May 30 22:38:17 CEST 2006


On Tue, 2006-05-30 at 19:08 +0100, Gavin Simpson wrote:
> Dear List,
> 
> Given,
> 
> y <- matrix(c(0,1,1,1,0,0,0,4,4), ncol = 3, byrow = TRUE)
> rownames(y) <- c("a","b","c")
> colnames(y) <- c("1","2","3")
> y
> y2 <- y[2:3, ]
> rownames(y2) <- c("x","z")
> y2
> 
> how can I stop
> 
> merge(y, y2, all = TRUE, sort = FALSE)
> 
> squishing the extra rows? Ideally I want the same as:
> 
> rbind(y, y2)
> 
> in this case. This is a specific example of a situation where two data
> matrices have the same column variables and all I want is to stick the
> two sets of rows together. However, I have been using merge for cases
> such as the one below, where the second matrix has extra column(s):
> 
> y3 <- matrix(c(0,1,1,1,0,0,0,4,4,5,6,7), ncol = 4, byrow = TRUE)
> rownames(y3) <- c("d","e","f")
> colnames(y3) <- c("1","2","3","4")
> y3
> merge(y, y3, all = TRUE, sort = FALSE)
> 
> We don't know beforehand if the columns will match. But I see now that
> even this doesn't work as I was expecting/thinking!
> 
> So I'm looking for a general way to merge two matrices such that the
> number of rows in the merged matrix is nrow(mat1) + nrow(mat2) and the
> number of columns in the merged matrix is
> length(unique(c(colnames(mat1), colnames(mat2)))).
> 
> Is there a function in R to do this, or can someone suggest a way to
> achieve this? My R version info is at the end.
> 
> Just to be clear, for the y, y3 example I want something like this
> returned:
> 
>   1 2 3  4
> a 0 1 1 NA
> b 1 0 0 NA
> c 0 4 4 NA
> d 0 1 1  1
> e 0 0 0  4
> f 4 5 6  7
> 
> and for the y, y2 example, I want something like this returned:
> 
>   1 2 3
> a 0 1 1
> b 1 0 0
> c 0 4 4
> x 1 0 0
> z 0 4 4

Gavin,

Here is a possible solution, though not fully tested.

It passes "row.names" in the 'by' argument, together with the column
names the two matrices share, so that the row names take part in the
matching process. This is noted in the "Details" section in ?merge.

So for y and y2:

> res <- merge(y, y2, 
               by = c("row.names", intersect(colnames(y),
                      colnames(y2))), 
               all = TRUE)

# Note that the row names are now the first col
> res
  Row.names 1 2 3
1         a 0 1 1
2         b 1 0 0
3         c 0 4 4
4         x 1 0 0
5         z 0 4 4
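
As a quick check beyond the output shown above, the same call can be
tried on the y, y3 case from the question. This is only a sketch: it
assumes, as ?merge describes for all = TRUE, that columns missing from
one input are filled with NA. The name res3 is just for illustration.

```r
# Matrices from the original question
y <- matrix(c(0,1,1,1,0,0,0,4,4), ncol = 3, byrow = TRUE)
rownames(y) <- c("a","b","c")
colnames(y) <- c("1","2","3")
y3 <- matrix(c(0,1,1,1,0,0,0,4,4,5,6,7), ncol = 4, byrow = TRUE)
rownames(y3) <- c("d","e","f")
colnames(y3) <- c("1","2","3","4")

# Same call as for y and y2, with y3 in place of y2
res3 <- merge(y, y3,
              by = c("row.names", intersect(colnames(y), colnames(y3))),
              all = TRUE)

dim(res3)            # rows and columns, counting the Row.names column
is.na(res3[["4"]])   # TRUE where the row came from y, which has no col "4"
```
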

# Subset res, leaving out the first col
> mat <- res[, -1]

# Set the rownames from res
# (mat is a data frame here; use as.matrix(mat) if a matrix is needed)
> rownames(mat) <- res[, 1]

> mat
  1 2 3
a 0 1 1
b 1 0 0
c 0 4 4
x 1 0 0
z 0 4 4
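
One caveat with the merge() approach above: a row of one matrix that
has the same row name and the same values in the shared columns as a
row of the other is treated as a match, so the two collapse into a
single row and nrow(mat1) + nrow(mat2) is not guaranteed. If the row
count has to be preserved, a different technique is to pad each matrix
with NA columns and then rbind() the results. The following is a rough
sketch of that idea, not something taken from ?merge; rbind_fill is a
made-up name, and it assumes both matrices have column names.

```r
# Hypothetical helper: stack two matrices by row, filling any column
# that one of them lacks with NA. Both inputs need column names.
rbind_fill <- function(m1, m2) {
  cols <- union(colnames(m1), colnames(m2))
  pad <- function(m) {
    out <- matrix(NA, nrow = nrow(m), ncol = length(cols),
                  dimnames = list(rownames(m), cols))
    out[, colnames(m)] <- m   # copy the existing columns by name
    out
  }
  rbind(pad(m1), pad(m2))
}

# Matrices from the original question
y <- matrix(c(0,1,1,1,0,0,0,4,4), ncol = 3, byrow = TRUE)
rownames(y) <- c("a","b","c")
colnames(y) <- c("1","2","3")
y2 <- y[2:3, ]
rownames(y2) <- c("x","z")
y3 <- matrix(c(0,1,1,1,0,0,0,4,4,5,6,7), ncol = 4, byrow = TRUE)
rownames(y3) <- c("d","e","f")
colnames(y3) <- c("1","2","3","4")

rbind_fill(y, y2)   # same values as rbind(y, y2)
rbind_fill(y, y3)   # NA in column "4" for rows a, b and c
```

Unlike merge(), this keeps the input row order and returns a matrix
rather than a data frame.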


