[R] merging
Marc Schwartz (via MN)
mschwartz at mn.rr.com
Tue May 30 22:43:16 CEST 2006
On Tue, 2006-05-30 at 15:38 -0500, Marc Schwartz (via MN) wrote:
> On Tue, 2006-05-30 at 19:08 +0100, Gavin Simpson wrote:
> > Dear List,
> >
> > Given,
> >
> > y <- matrix(c(0,1,1,1,0,0,0,4,4), ncol = 3, byrow = TRUE)
> > rownames(y) <- c("a","b","c")
> > colnames(y) <- c("1","2","3")
> > y
> > y2 <- y[2:3, ]
> > rownames(y2) <- c("x","z")
> > y2
> >
> > how can I stop
> >
> > merge(y, y2, all = TRUE, sort = FALSE)
> >
> > squishing the extra rows? Ideally I want the same as:
> >
> > rbind(y, y2)
> >
> > in this case. This is a specific example of a situation where two data
> > matrices have the same column variables and all I want is to stick the two
> > sets of rows together, but I have been using merge for cases such as the
> > one below, where the second matrix has extra column(s):
> >
> > y3 <- matrix(c(0,1,1,1,0,0,0,4,4,5,6,7), ncol = 4, byrow = TRUE)
> > rownames(y3) <- c("d","e","f")
> > colnames(y3) <- c("1","2","3","4")
> > y3
> > merge(y, y3, all = TRUE, sort = FALSE)
> >
> > We don't know beforehand if the columns will match. But I see now that
> > even this doesn't work as I was expecting/thinking!
> >
> > So I'm looking for a general way to merge two matrices such that the
> > number of rows in the merged matrix is nrow(mat1) + nrow(mat2) and the
> > number of columns in the merged matrix is
> > length(unique(c(colnames(mat1), colnames(mat2)))).
> >
> > Is there a function in R to do this, or can someone suggest a way to
> > achieve this? My R version info is at the end.
> >
> > Just to be clear, for the y, y3 example I want something like this
> > returned:
> >
> >   1 2 3  4
> > a 0 1 1 NA
> > b 1 0 0 NA
> > c 0 4 4 NA
> > d 0 1 1  1
> > e 0 0 0  4
> > f 4 5 6  7
> >
> > and for the y, y2 example, I want something like this returned:
> >
> >   1 2 3
> > a 0 1 1
> > b 1 0 0
> > c 0 4 4
> > x 1 0 0
> > z 0 4 4
>
> Gavin,
>
> Here is a possible solution, though not fully tested.
>
> It uses the "row.names" for the two matrices as part of the 'by'
> matching process. This is noted in the "Details" section in ?merge.
>
> So for y and y2:
>
> > res <- merge(y, y2,
>                by = c("row.names", intersect(colnames(y), colnames(y2))),
>                all = TRUE)
>
> # Note that the row names are now the first col
> > res
>   Row.names 1 2 3
> 1         a 0 1 1
> 2         b 1 0 0
> 3         c 0 4 4
> 4         x 1 0 0
> 5         z 0 4 4
>
> # Subset res, leaving out the first col
> > mat <- res[, -1]
>
> # Set the rownames from res
> > rownames(mat) <- res[, 1]
>
> > mat
>   1 2 3
> a 0 1 1
> b 1 0 0
> c 0 4 4
> x 1 0 0
> z 0 4 4

Ack...hit the wrong button. Sorry.

Must be the long weekend....yeah, that's my story and I'm sticking to
it... ;-)

Here is the solution for y and y3:

> res2 <- merge(y, y3,
                by = c("row.names", intersect(colnames(y), colnames(y3))),
                all = TRUE)

> res2
  Row.names 1 2 3  4
1         a 0 1 1 NA
2         b 1 0 0 NA
3         c 0 4 4 NA
4         d 0 1 1  1
5         e 0 0 0  4
6         f 4 5 6  7

# Subset res2, leaving out the first col
> mat2 <- res2[, -1]

# Set the rownames from res2
> rownames(mat2) <- res2[, 1]

> mat2
  1 2 3  4
a 0 1 1 NA
b 1 0 0 NA
c 0 4 4 NA
d 0 1 1  1
e 0 0 0  4
f 4 5 6  7

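
For comparison, here is a minimal sketch of a way to get the same shape without merge(), in case that is easier to reason about. It is not tested beyond the two examples in this thread. The helper name rbind_fill is made up for illustration and is not a base R function, and the sketch assumes both matrices have column names. It pads each matrix out to the union of the column names with NA and then calls rbind(), so the result always has nrow(m1) + nrow(m2) rows and stays a matrix, whereas merge() returns a data frame and collapses rows that agree on every 'by' column.

```r
# Hypothetical helper (not part of base R): stack two matrices by row,
# aligning columns by name and filling columns a matrix lacks with NA.
rbind_fill <- function(m1, m2) {
  all_cols <- union(colnames(m1), colnames(m2))
  pad <- function(m) {
    # Start from an all-NA matrix with the full set of columns
    out <- matrix(NA, nrow = nrow(m), ncol = length(all_cols),
                  dimnames = list(rownames(m), all_cols))
    # Copy in the columns that m actually has, matched by name
    out[, colnames(m)] <- m
    out
  }
  rbind(pad(m1), pad(m2))
}

# The matrices from the original question
y <- matrix(c(0,1,1,1,0,0,0,4,4), ncol = 3, byrow = TRUE,
            dimnames = list(c("a","b","c"), c("1","2","3")))
y2 <- y[2:3, ]
rownames(y2) <- c("x","z")
y3 <- matrix(c(0,1,1,1,0,0,0,4,4,5,6,7), ncol = 4, byrow = TRUE,
             dimnames = list(c("d","e","f"), c("1","2","3","4")))

res_y2 <- rbind_fill(y, y2)  # same columns in both: behaves like rbind(y, y2)
res_y3 <- rbind_fill(y, y3)  # y has no column "4": rows a, b, c get NA there
```

Unlike the merge() approach, this keeps the rows in their original order and does not sort them by row name.
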
HTH,
Marc Schwartz