[R] Intersection of 2 matrices
jim holtman
jholtman at gmail.com
Fri Dec 2 21:05:32 CET 2011
Here is one way of doing it:
> compMat2 <- function(A, B) {  # number of distinct rows of B present in A
+     B0 <- B[!duplicated(B), , drop = FALSE]  # stay a matrix even if one row is left
+     na <- nrow(A); nb <- nrow(B0)
+     AB <- rbind(A, B0)
+     ab <- duplicated(AB)[(na+1):(na+nb)]
+     return(sum(ab))
+ }
>
>
> set.seed(8237)
> A <- matrix(sample(1:1000, 2*67420, replace=TRUE), 67420, 2)
> B <- matrix(sample(1:1000, 2*59199, replace=TRUE), 59199, 2)
>
> system.time({
+ # convert for comparison
+ A.1 <- apply(A, 1, function(x) paste(x, collapse = ' '))
+ B.1 <- apply(B, 1, function(x) paste(x, collapse = ' '))
+ count <- sum(B.1 %in% A.1)
+ })
   user  system elapsed
   1.77    0.00    1.79
>
>
> count
[1] 3905
>
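Note that the two approaches can give different counts when B has repeated rows: the paste()/%in% count includes every matching row of B, repeats and all, while the duplicated() approach counts each distinct row of B at most once. A minimal sketch on made-up toy matrices (the data and the names count_all, count_unique and count_dup are only for illustration):

```r
# Hypothetical toy data: the row (1, 2) appears twice in B and once in A.
A <- matrix(c(1, 2,
              3, 4,
              5, 6), ncol = 2, byrow = TRUE)
B <- matrix(c(1, 2,
              1, 2,
              5, 6,
              7, 8), ncol = 2, byrow = TRUE)

# One string key per row, as in the timing run
A.1 <- apply(A, 1, paste, collapse = " ")
B.1 <- apply(B, 1, paste, collapse = " ")

count_all    <- sum(B.1 %in% A.1)          # repeated rows of B counted each time: 3
count_unique <- sum(unique(B.1) %in% A.1)  # distinct rows of B only: 2

# Same idea as the duplicated() approach: drop repeated rows of B,
# stack the rest under A, and look for repeats in the B part.
B0 <- B[!duplicated(B), , drop = FALSE]
AB <- rbind(A, B0)
count_dup <- sum(duplicated(AB)[nrow(A) + seq_len(nrow(B0))])  # 2
```

If each row of B should count only once, wrapping the keys in unique() as above is probably the smallest change to the paste() version.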
On Fri, Dec 2, 2011 at 2:46 PM, Hans W Borchers
<hwborchers at googlemail.com> wrote:
> Michael Kao <mkao006rmail <at> gmail.com> writes:
>
>>
> Well, taking a second look, I'd say it depends on the exact formulation.
>
> In the applications I have in mind, I would like to count each occurrence
> in B only once. Perhaps the OP never thought about duplicates in B.
>
> Hans Werner
>
>>
>> Here is an example based on the duplicated function
>>
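>> test.mat1 <- matrix(1:20, ncol = 5)
>>
>> # test.mat1 has only 4 rows, so sample the row indices from 1:4
>> test.mat2 <- rbind(test.mat1[sample(nrow(test.mat1), 2), ], matrix(101:120, ncol = 5))
>>
>> compMat <- function(mat1, mat2){
>>     nr1 <- nrow(mat1)
>>     nr2 <- nrow(mat2)
>>     # rows of mat2 that already occur in mat1 (or earlier in mat2)
>>     mat2[duplicated(rbind(mat1, mat2))[(nr1 + 1):(nr1 + nr2)], , drop = FALSE]
>> }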
>>
>> compMat(test.mat1, test.mat2)
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.