[R] Intersection of 2 matrices

jim holtman jholtman at gmail.com
Fri Dec 2 21:05:32 CET 2011


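Here is one way of doing it: paste each row into a single string and
test membership with %in%. Note that this counts a duplicated row of B
every time it occurs, whereas compMat2 counts each distinct row once.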

>    compMat2 <- function(A, B) {  # number of distinct rows of B present in A
+        B0 <- B[!duplicated(B), , drop = FALSE]  # drop = FALSE keeps a matrix even if one row remains
+        na <- nrow(A); nb <- nrow(B0)
+        AB <- rbind(A, B0)
+        ab <- duplicated(AB)[(na+1):(na+nb)]
+        return(sum(ab))
+    }
>
>
>    set.seed(8237)
>    A  <- matrix(sample(1:1000, 2*67420, replace=TRUE), 67420, 2)
>    B  <- matrix(sample(1:1000, 2*59199, replace=TRUE), 59199, 2)
>
>    system.time({
+       # collapse each row to one string so rows can be compared with %in%
+       A.1 <- apply(A, 1, function(x) paste(x, collapse = ' '))
+       B.1 <- apply(B, 1, function(x) paste(x, collapse = ' '))
+       count <- sum(B.1 %in% A.1)
+    })
   user  system elapsed
   1.77    0.00    1.79
>
>
> count
[1] 3905
>
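
To make the difference between the two counts concrete, here is a small
sketch on toy matrices (not the data above). The helper name rowKey is
made up for this illustration; it builds the same row strings as the
apply() calls, but uses the fact that paste() is vectorised over columns.
Treat it as one possible variant rather than a tested replacement.

```r
# Hypothetical helper: one string key per matrix row.
# paste() is vectorised, so the columns are pasted together in one call.
rowKey <- function(M) do.call(paste, c(as.data.frame(M), sep = " "))

A <- matrix(c(1, 2,
              3, 4,
              5, 6), ncol = 2, byrow = TRUE)
B <- matrix(c(3, 4,
              3, 4,   # duplicate of the row above
              7, 8,
              1, 2), ncol = 2, byrow = TRUE)

kA <- rowKey(A)
kB <- rowKey(B)

n_all      <- sum(kB %in% kA)          # every matching row of B, duplicates included
n_distinct <- sum(unique(kB) %in% kA)  # each distinct row of B counted once

print(c(all = n_all, distinct = n_distinct))
```

Here n_all is 3 and n_distinct is 2, because the row (3, 4) appears twice
in B. Which of the two is the "right" count depends on how the
intersection is defined, as Hans Werner points out below.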

On Fri, Dec 2, 2011 at 2:46 PM, Hans W Borchers
<hwborchers at googlemail.com> wrote:
> Michael Kao <mkao006rmail <at> gmail.com> writes:
>
>>
> Well, taking a second look, I'd say it depends on the exact formulation.
>
> In the applications I have in mind, I would like to count each occurrence
> in B only once. Perhaps the OP never thought about duplicates in B.
>
> Hans Werner
>
>>
>> Here is an example based on the duplicated function
>>
>> test.mat1 <- matrix(1:20, nc = 5)
>>
>> test.mat2 <- rbind(test.mat1[sample(1:5, 2), ], matrix(101:120, nc = 5))
>>
>> compMat <- function(mat1, mat2){
>>      nr1 <- nrow(mat1)
>>      nr2 <- nrow(mat2)
>>      mat2[duplicated(rbind(mat1, mat2))[(nr1 + 1):(nr1 + nr2)], ]
>> }
>>
>> compMat(test.mat1, test.mat2)
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


