[R] Intersection of 2 matrices
    jim holtman 
    jholtman at gmail.com
       
    Fri Dec  2 21:05:32 CET 2011
    
    
  
Here is one way of doing it:
>    compMat2 <- function(A, B) {  # rows of B present in A
+        B0 <- B[!duplicated(B), , drop = FALSE]  # keep a matrix even if one row remains
+        na <- nrow(A); nb <- nrow(B0)
+        AB <- rbind(A, B0)
+        ab <- duplicated(AB)[(na+1):(na+nb)]
+        return(sum(ab))
+    }
>
>
>    set.seed(8237)
>    A  <- matrix(sample(1:1000, 2*67420, replace=TRUE), 67420, 2)
>    B  <- matrix(sample(1:1000, 2*59199, replace=TRUE), 59199, 2)
>
>    system.time({
+       # collapse each row to a string key, then count rows of B found in A
+       A.1 <- apply(A, 1, function(x) paste(x, collapse = ' '))
+       B.1 <- apply(B, 1, function(x) paste(x, collapse = ' '))
+       count <- sum(B.1 %in% A.1)
+    })
   user  system elapsed
   1.77    0.00    1.79
>
>
> count
[1] 3905
>
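To make the duplicate question raised in the quoted reply concrete, here is a minimal sketch on hand-made data. The small matrices and the helper name rowKey are illustrative assumptions, not from the thread. It contrasts counting every matching row of B with counting each distinct row of B once, and shows that the duplicated() approach agrees with the latter.

```r
# Small hand-checkable matrices (hypothetical data, not from the thread)
A <- rbind(c(1, 2), c(3, 4), c(5, 6))
B <- rbind(c(3, 4), c(3, 4), c(7, 8), c(1, 2))

# Key each row as a string. paste() is vectorized over its arguments,
# so apply() is not needed for a two-column matrix.
rowKey <- function(M) paste(M[, 1], M[, 2])

# Every row of B that occurs in A, duplicates in B included
countAll <- sum(rowKey(B) %in% rowKey(A))

# Each distinct row of B that occurs in A, counted once
countDistinct <- sum(unique(rowKey(B)) %in% rowKey(A))

# Same distinct count via duplicated() on the stacked matrices
B0 <- B[!duplicated(B), , drop = FALSE]
countDup <- sum(duplicated(rbind(A, B0))[nrow(A) + seq_len(nrow(B0))])

countAll       # 3
countDistinct  # 2
countDup       # 2
```

Which count is wanted depends on whether repeated rows in B should each count as a match.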
On Fri, Dec 2, 2011 at 2:46 PM, Hans W Borchers
<hwborchers at googlemail.com> wrote:
> Michael Kao <mkao006rmail <at> gmail.com> writes:
>
>>
> Well, taking a second look, I'd say it depends on the exact formulation.
>
> In the applications I have in mind, I would like to count each occurrence
> in B only once. Perhaps the OP never thought about duplicates in B.
>
> Hans Werner
>
>>
>> Here is an example based on the duplicated function
>>
>> test.mat1 <- matrix(1:20, nc = 5)
>>
>> test.mat2 <- rbind(test.mat1[sample(nrow(test.mat1), 2), ], matrix(101:120, nc = 5))
>>
>> compMat <- function(mat1, mat2){
>>      nr1 <- nrow(mat1)
>>      nr2 <- nrow(mat2)
>>      mat2[duplicated(rbind(mat1, mat2))[(nr1 + 1):(nr1 + nr2)], ]
>> }
>>
>> compMat(test.mat1, test.mat2)
>>
>
-- 
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
    
    