[R] Comparing matrices in R - matrixB %in% matrixA
John Fox
jfox at mcmaster.ca
Fri Oct 31 15:40:10 CET 2014
Dear Jeff,
Out of curiosity, I compared your solution with the one I posted earlier this morning. (I was working on a slower computer then, which accounts for the somewhat different timings for my solution.)
------------ snip ----------
> A <- matrix(1:10000, 10000, 10)
> B <- A[1:1000, ]
>
> system.time({
+     AA <- as.list(as.data.frame(t(A)))
+     BB <- as.list(as.data.frame(t(B)))
+     print(sum(AA %in% BB))
+ })
[1] 1000
   user  system elapsed
   0.14    0.01    0.16
>
>
> system.time({
+     lresult <- rep( NA, nrow(A) )
+     for ( ia in seq.int( nrow( A ) ) ) {
+         lres <- FALSE
+         ib <- 0
+         while ( ib < nrow( B ) & !lres ) {
+             ib <- ib + 1
+             lres <- all( A[ ia, ] == B[ ib, ] )
+         }
+         lresult[ ia ] <- lres
+     }
+     print(sum( lresult ))
+ })
[1] 1000
   user  system elapsed
  45.76    0.01   45.77
> 46/0.16
[1] 287.5
------------ snip ----------
So the solution using nested loops is nearly 290 times slower for this problem, i.e., more than two orders of magnitude. Of course, for a one-off problem, depending on its size, the difference may not matter.
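
For reference, here is a minimal sketch of the list-based approach applied to the toy example from the original question. The function name rows_in is hypothetical (it does not appear anywhere in this thread), and the sketch assumes both arguments are matrices with the same number of columns. As far as I understand it, match() compares list elements through a character representation, so this is safest with integer or character data.

```r
# Hypothetical helper (name not from the thread): returns TRUE for each
# row of A that also occurs as a row of B.
rows_in <- function(A, B) {
  # Both inputs must be matrices; a single row extracted without
  # drop = FALSE becomes a plain vector and would be split incorrectly.
  stopifnot(is.matrix(A), is.matrix(B), ncol(A) == ncol(B))
  # Transposing turns each row into a data-frame column, so as.list()
  # yields one vector per original row.
  AA <- as.list(as.data.frame(t(A)))
  BB <- as.list(as.data.frame(t(B)))
  unname(AA %in% BB)
}

# Toy example from the original question
A <- matrix(1:10, nrow = 5)
B <- A[-c(1, 2, 3), ]
which(rows_in(A, B))   # 4 5
```

If B might have only one row, subset it with drop = FALSE (for example A[5, , drop = FALSE]) so that it stays a matrix.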
Best,
John
-----------------------------------------------
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.socsci.mcmaster.ca/jfox/
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Jeff Newmiller
> Sent: Friday, October 31, 2014 10:15 AM
> To: Charles Novaes de Santana; r-help at r-project.org
> Subject: Re: [R] Comparing matrices in R - matrixB %in% matrixA
>
> Thank you for the reproducible example, but posting in HTML can corrupt
> your example code so please learn to set your email client mail format
> appropriately when posting to this list.
>
> I think this [1] post, found with a quick Google search for "R match
> matrix", fits your situation perfectly.
>
> match(data.frame(t(B)), data.frame(t(A)))
>
> Note that concatenating vectors in loops is bad news... a basic
> optimization for your code would be to preallocate a logical result
> vector and fill in each element with a TRUE/FALSE in the outer loop,
> and use the which() function on that completed vector to identify the
> index numbers (if you really need that). For example:
>
> lresult <- rep( NA, nrow(A) )
> for ( ia in seq.int( nrow( A ) ) ) {
>   lres <- FALSE
>   ib <- 0
>   while ( ib < nrow( B ) & !lres ) {
>     ib <- ib + 1
>     lres <- all( A[ ia, ] == B[ ib, ] )
>   }
>   lresult[ ia ] <- lres
> }
> result <- which( lresult )
>
> [1] http://stackoverflow.com/questions/12697122/in-r-match-function-for-rows-or-columns-of-matrix
> ---------------------------------------------------------------------------
> Jeff Newmiller
> DCN:<jdnewmil at dcn.davis.ca.us>
> Research Engineer (Solar/Batteries/Software/Embedded Controllers)
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> On October 31, 2014 6:20:38 AM PDT, Charles Novaes de Santana
> <charles.santana at gmail.com> wrote:
> >My apologies: I sent the message before finishing it. I am very sorry
> >about this. Please find my message below (I usually write my messages
> >from the end to the beginning... sorry :)).
> >
> >Dear all,
> >
> >I am trying to compare two matrices in order to find which rows of a
> >matrix A contain the same values as the rows of a matrix B. I am trying
> >to do this for matrices with around 2500 elements, but please find below
> >a toy example:
> >
> >A = matrix(1:10,nrow=5)
> >B = A[-c(1,2,3),];
> >
> >So
> >> A
> >     [,1] [,2]
> >[1,]    1    6
> >[2,]    2    7
> >[3,]    3    8
> >[4,]    4    9
> >[5,]    5   10
> >
> >and
> >> B
> >     [,1] [,2]
> >[1,]    4    9
> >[2,]    5   10
> >
> >I would like to compare A and B in order to find which rows of A match
> >the rows of B, something similar to %in% for one-dimensional arrays.
> >In the example above, the answer should be 4 and 5.
> >
> >I wrote a function to do it (see below). It gives the correct answer
> >for this toy example, but the excess of for-loops makes it extremely
> >slow for larger matrices. I was wondering if there is a better way to do
> >this kind of comparison. Any ideas? Sorry if it is a stupid question.
> >
> >matbinmata<-function(B,A){
> >  res<-c();
> >  rowsB = length(B[,1]);
> >  rowsA = length(A[,1]);
> >  colsB = length(B[1,]);
> >  colsA = length(A[1,]);
> >  for (i in 1:rowsB){
> >    for (j in 1:colsB){
> >      for (k in 1:rowsA){
> >        for (l in 1:colsA){
> >          if(A[k,l]==B[i,j]){res<-c(res,k);}
> >        }
> >      }
> >    }
> >  }
> >  return(unique(sort(res)));
> >}
> >
> >
> >Best,
> >
> >Charles
> >
> >
> >--
> >Um axé! :)
> >
> >--
> >Charles Novaes de Santana, PhD
> >http://www.imedea.uib-csic.es/~charles
> >
> > [[alternative HTML version deleted]]
> >
> >______________________________________________
> >R-help at r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>