[R] Comparing matrices in R - matrixB %in% matrixA
John Fox
jfox at mcmaster.ca
Fri Oct 31 14:35:06 CET 2014
Dear Charles,
How about the following?
----------- snip ---------
> AA <- as.list(as.data.frame(t(A)))
> BB <- as.list(as.data.frame(t(B)))
> which(AA %in% BB)
[1] 4 5
----------- snip ---------
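For a self-contained version of the above, here is a minimal sketch that first recreates the toy matrices A and B from your message. The idea is that t() followed by as.data.frame() and as.list() yields one list element per original row, so %in% compares whole rows rather than single values. The name rows_found is only illustrative.

```r
# Toy matrices from the original question
A <- matrix(1:10, nrow = 5)
B <- A[-c(1, 2, 3), ]

# t() turns rows into columns; as.data.frame() + as.list() then gives
# one list element per original row
AA <- as.list(as.data.frame(t(A)))
BB <- as.list(as.data.frame(t(B)))

# %in% on lists matches whole elements, i.e. whole rows
rows_found <- which(AA %in% BB)   # illustrative name
print(rows_found)
```

One caveat, as far as I can tell: if B ends up with a single row, subsetting with drop = FALSE keeps it a matrix, which this approach needs so that t(B) still has one column per row of B.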
This seems reasonably fast. For example:
----------- snip ---------
> A <- matrix(1:10000, 10000, 10)
> B <- A[1:1000, ]
>
> system.time({
+ AA <- as.list(as.data.frame(t(A)))
+ BB <- as.list(as.data.frame(t(B)))
+ print(sum(AA %in% BB))
+ })
[1] 1000
   user  system elapsed
   0.26    0.00    0.26
----------- snip ---------
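An alternative, sketched here under the assumption that the matrix entries can be pasted into unambiguous strings, is to build one string key per row and compare the keys with %in%. The helper name rows_in and the "\r" separator are hypothetical choices for illustration, not part of the code above, and no timing is claimed for it.

```r
# Hypothetical helper: TRUE for each row of A that also occurs as a row of B
rows_in <- function(A, B) {
  # Build one string key per row; "\r" is an arbitrary separator that is
  # assumed not to occur in the data
  key_a <- do.call(paste, c(as.data.frame(A), sep = "\r"))
  key_b <- do.call(paste, c(as.data.frame(B), sep = "\r"))
  key_a %in% key_b
}

A <- matrix(1:10000, 10000, 10)
B <- A[1:1000, , drop = FALSE]   # drop = FALSE keeps B a matrix even for one row

hits <- rows_in(A, B)
print(sum(hits))
```

Because the keys are strings, this is probably safest for integer or character data; for non-integer values the conversion to text may treat nearly equal numbers as equal.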
I hope this helps,
John
------------------------------------------------
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/
On Fri, 31 Oct 2014 14:20:38 +0100
Charles Novaes de Santana <charles.santana at gmail.com> wrote:
> My apologies: I sent the message before finishing it. I am very
> sorry about this. Please find my complete message below (I usually write my
> messages from the end to the beginning... sorry :)).
>
> Dear all,
>
> I am trying to compare two matrices to find which rows of a matrix A
> contain the same values as the rows of a matrix B. I am trying to do it
> for matrices with around 2500 elements, but please find a toy example below:
>
> A = matrix(1:10,nrow=5)
> B = A[-c(1,2,3),];
>
> So
> > A
>      [,1] [,2]
> [1,]    1    6
> [2,]    2    7
> [3,]    3    8
> [4,]    4    9
> [5,]    5   10
>
> and
> > B
>      [,1] [,2]
> [1,]    4    9
> [2,]    5   10
>
> I would like to compare A and B to find which rows of A match the rows
> of B: something similar to %in% for one-dimensional arrays.
> In the example above, the answer should be 4 and 5.
>
> I wrote a function to do it (see below). It gives me the correct answer
> for this toy example, but the nested for-loops make it extremely slow
> for larger matrices. I was wondering whether there is a better way to do this
> kind of comparison. Any ideas? Sorry if it is a stupid question.
>
> matbinmata<-function(B,A){
>   res<-c();
>   rowsB = length(B[,1]);
>   rowsA = length(A[,1]);
>   colsB = length(B[1,]);
>   colsA = length(A[1,]);
>   for (i in 1:rowsB){
>     for (j in 1:colsB){
>       for (k in 1:rowsA){
>         for (l in 1:colsA){
>           if(A[k,l]==B[i,j]){res<-c(res,k);}
>         }
>       }
>     }
>   }
>   return(unique(sort(res)));
> }
>
>
> Best,
>
> Charles
>
> --
> Um axé! :)
>
> --
> Charles Novaes de Santana, PhD
> http://www.imedea.uib-csic.es/~charles