[R] Comparing matrices in R - matrixB %in% matrixA

Charles Novaes de Santana charles.santana at gmail.com
Fri Oct 31 14:39:20 CET 2014


Great!! It is perfect!

Thank you, John, for this elegant and fast suggestion!

Best,

Charles

On Fri, Oct 31, 2014 at 2:35 PM, John Fox <jfox at mcmaster.ca> wrote:

> Dear Charles,
>
> How about the following?
>
> ----------- snip ---------
>
> > AA <- as.list(as.data.frame(t(A)))
> > BB <- as.list(as.data.frame(t(B)))
> > which(AA %in% BB)
> [1] 4 5
>
> ----------- snip ---------
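
For reuse, the three lines above can be wrapped in a small function. This is only a sketch: the name rows_in is made up here and is not part of the suggestion itself. It assumes that both arguments are matrices with the same number of columns, so a single row should be extracted with drop = FALSE to keep it a matrix.

```r
# Hypothetical helper (the name is an assumption, not from the thread):
# returns the indices of the rows of A that also occur as rows of B.
rows_in <- function(A, B) {
  # t() turns rows into columns, so each column of the data frame is one
  # original row; as.list() then gives one vector per row.
  AA <- as.list(as.data.frame(t(A)))
  BB <- as.list(as.data.frame(t(B)))
  # %in% is applied to the two lists, whose elements are whole rows.
  which(AA %in% BB)
}

A <- matrix(1:10, nrow = 5)
B <- A[-c(1, 2, 3), ]
rows_in(A, B)                      # rows 4 and 5, as in the output above

# A single row must stay a matrix, hence drop = FALSE.
rows_in(A, A[5, , drop = FALSE])   # row 5
```
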
>
> This seems reasonably fast. For example:
>
> ----------- snip ---------
>
> > A <- matrix(1:10000, 10000, 10)
> > B <- A[1:1000, ]
> >
> > system.time({
> +   AA <- as.list(as.data.frame(t(A)))
> +   BB <- as.list(as.data.frame(t(B)))
> +   print(sum(AA %in% BB))
> + })
> [1] 1000
>    user  system elapsed
>    0.26    0.00    0.26
>
> ----------- snip ---------
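
A different route to the same count, not taken from the suggestion above, is to build one character key per row and compare the keys with %in%. The sketch below assumes the separator "\r" never occurs in the data; row_keys is a made-up name. No timing is claimed for it, since that depends on the machine and the R version.

```r
A <- matrix(1:10000, 10000, 10)
B <- A[1:1000, ]

# Hypothetical helper: one string per row, with the values joined by a
# separator that is assumed not to appear in the data.
row_keys <- function(M) apply(M, 1, paste, collapse = "\r")

hits <- which(row_keys(A) %in% row_keys(B))
length(hits)   # 1000, matching the count printed above
```
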
>
> I hope this helps,
>  John
>
> ------------------------------------------------
> John Fox, Professor
> McMaster University
> Hamilton, Ontario, Canada
> http://socserv.mcmaster.ca/jfox/
>
>
>
>
> On Fri, 31 Oct 2014 14:20:38 +0100
>  Charles Novaes de Santana <charles.santana at gmail.com> wrote:
> > My apologies: I sent the previous message before finishing it. I am very
> > sorry about this. Please find my complete message below (I usually write
> > my messages from the end to the beginning... sorry :)).
> >
> > Dear all,
> >
> > I am trying to compare two matrices in order to find which rows of a
> > matrix A also appear as rows of a matrix B. I need to do this for
> > matrices with around 2500 elements, but here is a toy example:
> >
> > A = matrix(1:10,nrow=5)
> > B = A[-c(1,2,3),];
> >
> > So
> > > A
> >      [,1] [,2]
> > [1,]    1    6
> > [2,]    2    7
> > [3,]    3    8
> > [4,]    4    9
> > [5,]    5   10
> >
> > and
> > > B
> >      [,1] [,2]
> > [1,]    4    9
> > [2,]    5   10
> >
> > I would like to compare A and B in order to find which rows of A match
> > the rows of B, similar to what %in% does for one-dimensional arrays.
> > In the example above, the answer should be 4 and 5.
> >
> > I wrote a function to do it (see below). It gives the correct answer
> > for this toy example, but the nested for-loops make it extremely slow
> > for larger matrices. I was wondering if there is a better way to do this
> > kind of comparison. Any ideas? Sorry if it is a stupid question.
> >
> > matbinmata<-function(B,A){
> >     res<-c();
> >     rowsB = length(B[,1]);
> >     rowsA = length(A[,1]);
> >     colsB = length(B[1,]);
> >     colsA = length(A[1,]);
> >     for (i in 1:rowsB){
> >         for (j in 1:colsB){
> >             for (k in 1:rowsA){
> >                 for (l in 1:colsA){
> >                     if(A[k,l]==B[i,j]){res<-c(res,k);}
> >                 }
> >             }
> >         }
> >     }
> >     return(unique(sort(res)));
> > }
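
Note that the loops above test single values, not whole rows: a row of A is reported as soon as any one of its values occurs anywhere in B. The sketch below is a vectorised version of that same test (any_value_in is a made-up name), followed by a small case where it differs from a true row-by-row match.

```r
# Hypothetical name; vectorised sketch of what the nested loops compute:
# the rows of A that contain at least one value occurring anywhere in B.
any_value_in <- function(B, A) {
  hit <- matrix(A %in% B, nrow = nrow(A))  # TRUE where a value of A occurs in B
  which(rowSums(hit) > 0)
}

A <- matrix(1:10, nrow = 5)
B <- A[-c(1, 2, 3), ]
any_value_in(B, A)     # rows 4 and 5, as in the toy example

# A case where value matching and row matching disagree:
A2 <- matrix(1:4, nrow = 2)          # rows: (1, 3) and (2, 4)
B2 <- matrix(c(3L, 1L), nrow = 1)    # single row: (3, 1)
any_value_in(B2, A2)   # row 1 is reported although (3, 1) is not a row of A2
```
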
> >
> >
> > Best,
> >
> > Charles
> >
> >
> >
> >
> > --
> > Um axé! :)
> >
> > --
> > Charles Novaes de Santana, PhD
> > http://www.imedea.uib-csic.es/~charles
> >
>
>


-- 
Um axé! :)

--
Charles Novaes de Santana, PhD
http://www.imedea.uib-csic.es/~charles
