[R] Intersection of 2 matrices
Hans W Borchers
hwborchers at googlemail.com
Fri Dec 2 20:22:31 CET 2011
Michael Kao <mkao006rmail <at> gmail.com> writes:
>
Your solution is fast, but not completely correct, because you are also
counting possible duplicates within the second matrix. The 'refitted'
function could look as follows:
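compMat2 <- function(A, B) {  # number of distinct rows of B present in A
    B0 <- B[!duplicated(B), , drop = FALSE]  # drop duplicate rows within B
    na <- nrow(A); nb <- nrow(B0)
    AB <- rbind(A, B0)
    # a row of B0 is flagged as duplicated only if it already occurs in A
    ab <- duplicated(AB)[(na+1):(na+nb)]
    return(sum(ab))
}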
and testing it on an example of the size the OP was asking for:
set.seed(8237)
A <- matrix(sample(1:1000, 2*67420, replace=TRUE), 67420, 2)
B <- matrix(sample(1:1000, 2*59199, replace=TRUE), 59199, 2)
system.time(n <- compMat2(A, B)) # n = 3790
whereas compMat() returns 5522 rows, 1732 of which are duplicates within B.
A 3.06 GHz iMac needs about 2 -- 2.5 seconds.
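
To see where the difference comes from, here is a minimal sketch on made-up
data. The two small matrices and the helper row_keys() are hypothetical and
not from the thread. B repeats one row that occurs in A and one row that does
not; the block compares the counts with and without de-duplicating B, and
cross-checks the result with a different technique (pasting each row into a
string key and using %in%).

```r
# Made-up data: B repeats the row (1, 2), which is in A,
# and the row (5, 6), which is not.
A <- matrix(c(1, 2,
              3, 4), ncol = 2, byrow = TRUE)
B <- matrix(c(1, 2,
              1, 2,
              5, 6,
              5, 6), ncol = 2, byrow = TRUE)

# Flags for the B part of the stacked matrix, duplicates in B left in
idx_B <- nrow(A) + seq_len(nrow(B))
n_raw <- sum(duplicated(rbind(A, B))[idx_B])

# Same count after dropping duplicate rows within B first
B0 <- B[!duplicated(B), , drop = FALSE]
idx_B0 <- nrow(A) + seq_len(nrow(B0))
n_dedup <- sum(duplicated(rbind(A, B0))[idx_B0])

# Cross-check with string keys per row (hypothetical helper)
row_keys <- function(M) do.call(paste, c(as.data.frame(M), sep = "_"))
n_keys <- sum(unique(row_keys(B)) %in% row_keys(A))

c(raw = n_raw, dedup = n_dedup, keys = n_keys)
```

Here the raw count is 3 although only one distinct row of B occurs in A: the
repeated (1, 2) is counted twice, and the second (5, 6) is flagged without
being in A at all. The de-duplicated count and the key-based count both give 1.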
Hans Werner
> On 2/12/2011 2:48 p.m., David Winsemius wrote:
> >
> > On Dec 2, 2011, at 4:20 AM, oluwole oyebamiji wrote:
> >
> >> Hi all,
> >> I have matrix A of 67420 by 2 and another matrix B of 59199 by 2.
> >> I would like to find the number of rows of matrix B that I can find
> >> in matrix A (rows that are common to both matrices with or without
> >> sorting).
> >>
> >> I have tried the "intersection" and "is.element" functions in R, but
> >> they only work for vectors and not for matrices,
> >> i.e., intersection(A,B) and is.element(A,B).
> >
> > Have you considered the 'duplicated' function?
> >
>
> Here is an example based on the duplicated function
>
> test.mat1 <- matrix(1:20, ncol = 5)
>
> # test.mat1 has 4 rows, so sample the row indices from 1:4
> test.mat2 <- rbind(test.mat1[sample(1:4, 2), ], matrix(101:120, ncol = 5))
>
> compMat <- function(mat1, mat2){
>     nr1 <- nrow(mat1)
>     nr2 <- nrow(mat2)
>     # rows of mat2 flagged as duplicates in the stacked matrix
>     mat2[duplicated(rbind(mat1, mat2))[(nr1 + 1):(nr1 + nr2)], , drop = FALSE]
> }
>
> compMat(test.mat1, test.mat2)
>
>