[R] How to efficiently compare each row in a matrix with each row in another matrix?

arun smartpink111 at yahoo.com
Mon Dec 10 18:21:35 CET 2012


Hi Jonathan,
I tested your code from Nabble.

Your solution looks like the fastest on the benchmark below, but its result differs from Marius.4.0 on the toy example further down:
N <- 1000
M <- 5
P <- 5000
set.seed(15)
A <- matrix(runif(N,1,1000),nrow=N,ncol=M)
set.seed(425)
B <- matrix(runif(M,1,1000),nrow=P,ncol=M)

library(matrixStats)
Marius.5.0 <- function(A, B) outer(rowMaxs(A), rowMins(B), '<') # Jonathan's code
system.time(z5.0 <- Marius.5.0(A, B))
#   user  system elapsed 
#  0.280   0.040   0.321 
Marius.4.0 <- function(A, B) apply(B, 1, function(x) colSums(x >= t(A)) == ncol(A))
system.time(z4.0 <- Marius.4.0(A, B))
#   user  system elapsed 
#  0.460   0.044   0.506 
identical(z5.0, z4.0)
#[1] TRUE
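
A possible explanation for the agreement above, offered as an observation rather than something stated in the thread: runif(N, 1, 1000) draws only N values and runif(M, 1, 1000) only M, so matrix() recycles them and every row of A and of B ends up constant. With constant rows, a row-wise max/min test and a componentwise test coincide. The sketch below checks this; A2 and B2 are hypothetical names for a variant that draws N*M and P*M values, assuming independent entries were intended.

```r
N <- 1000
M <- 5
P <- 5000

# Benchmark data exactly as constructed in the message
set.seed(15)
A <- matrix(runif(N, 1, 1000), nrow = N, ncol = M)
set.seed(425)
B <- matrix(runif(M, 1, 1000), nrow = P, ncol = M)

# Every column equals the first one, i.e. each row is constant
all(A == A[, 1])
all(B == B[, 1])

# Assumed intent (not from the thread): one independent draw per entry
set.seed(15)
A2 <- matrix(runif(N * M, 1, 1000), nrow = N, ncol = M)
set.seed(425)
B2 <- matrix(runif(P * M, 1, 1000), nrow = P, ncol = M)
all(A2 == A2[, 1])
```

Rerunning the comparison on A2 and B2 would be a fairer check of whether the two functions really return the same result.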

# When I test it with the toy example, the results differ:
A <- matrix(c(1:4, 6, 2), ncol = 2, byrow = TRUE) # (3, 2) matrix
B <- matrix(1:10, ncol = 2) # (5, 2) matrix
Marius.5.0(A, B)
#      [,1]  [,2]  [,3]  [,4]  [,5]
#[1,] FALSE FALSE  TRUE  TRUE  TRUE
#[2,] FALSE FALSE FALSE FALSE  TRUE
#[3,] FALSE FALSE FALSE FALSE FALSE
Marius.4.0(A, B)
#      [,1]  [,2]  [,3]  [,4]  [,5]
#[1,]  TRUE  TRUE  TRUE  TRUE  TRUE
#[2,] FALSE FALSE  TRUE  TRUE  TRUE
#[3,] FALSE FALSE FALSE FALSE FALSE
identical(Marius.5.0(A, B), Marius.4.0(A, B))
#[1] FALSE
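
The mismatch seems to come from the two functions testing different conditions: Marius.5.0 asks whether the largest entry of a row of A is strictly below the smallest entry of a row of B, while Marius.4.0 compares the rows entry by entry with <=. Below is a minimal sketch on the first rows of the toy matrices, plus one way to keep the outer() idea while comparing column by column. outer_by_column is a hypothetical helper, not code from the thread, and its speed has not been measured here.

```r
# Toy matrices from the message
A <- matrix(c(1:4, 6, 2), ncol = 2, byrow = TRUE)
B <- matrix(1:10, ncol = 2)

a <- A[1, ]  # (1, 2)
b <- B[1, ]  # (1, 6)

all(a <= b)      # componentwise test, as in Marius.4.0: TRUE
max(a) < min(b)  # max/min test, as in Marius.5.0: FALSE

# Hypothetical helper: one outer() per column, combined with a logical AND,
# so that entry j of a row of A is only compared with entry j of a row of B.
outer_by_column <- function(A, B) {
  per_column <- lapply(seq_len(ncol(A)),
                       function(j) outer(A[, j], B[, j], "<="))
  Reduce(`&`, per_column)
}

# Componentwise reference (the body of Marius.4.0)
reference <- apply(B, 1, function(x) colSums(x >= t(A)) == ncol(A))
identical(outer_by_column(A, B), reference)
```

The helper needs ncol(A) calls to outer(), so it is likely to pay off mainly when the number of columns is small relative to the number of rows.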

A.K.

----- Original Message -----
From: Hofert  Jan Marius <marius.hofert at math.ethz.ch>
To: arun <smartpink111 at yahoo.com>
Cc: Thomas Stewart <tgs.public.mail at gmail.com>; "mailman, r-help" <r-help at r-project.org>
Sent: Saturday, December 8, 2012 2:15 PM
Subject: RE: [R] How to efficiently compare each row in a matrix with each row in another matrix?

The idea is good, but you don't need to create a list of the rows of B first; apply() does the job:

Marius.4.0 <- function(A, B)
    apply(B, 1, function(x) colSums(x>=t(A))==ncol(A))
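
To see why this one-liner works, it may help to unpack it for a single row of B. The sketch below uses the toy A from the original question and x = c(3, 8), the third row of the toy B; the intermediate names tA, cmp and hits are only for illustration.

```r
A <- matrix(c(1, 2, 3, 4, 6, 2), ncol = 2, byrow = TRUE)  # rows (1,2), (3,4), (6,2)
x <- c(3, 8)                                              # one row of B

tA <- t(A)            # each column of tA is one row of A
cmp <- x >= tA        # x is recycled down every column of tA
hits <- colSums(cmp)  # per row of A: how many entries of x are >= that row's entries
hits == ncol(A)       # TRUE TRUE FALSE
```

apply(B, 1, ...) repeats this for every row of B and binds the resulting logical vectors as columns, which gives the (nrow(A), nrow(B)) matrix.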

That was actually a bit faster than your version, which makes it the fastest version so far.

I also compared it with C code called via .C(): the C code was 15% faster.

Cheers,

Marius


________________________________________
From: arun [smartpink111 at yahoo.com]
Sent: Saturday, December 08, 2012 7:43 PM
To: Hofert  Jan Marius
Cc: Thomas Stewart; mailman, r-help
Subject: Re: [R] How to efficiently compare each row in a matrix with each row in another matrix?

Hi,

Just to add:
N <- 1000
M <- 5
P <- 5000
set.seed(15)
A <- matrix(runif(N,1,1000),nrow=N,ncol=M)
set.seed(425)
B <- matrix(runif(M,1,1000),nrow=P,ncol=M)

Marius.3.0 <- function(A, B) {
    do.call(cbind, lapply(split(B, row(B)),
                          function(x) colSums(x >= t(A)) == ncol(A)))
}
Marius.2.0 <- function(A, B){
    nA <- nrow(A)
    nB <- nrow(B)
    C <- do.call(rbind, rep(list(B), nA)) >= matrix(rep(A, each=nB), ncol=ncol(B))
    matrix(rowSums(C) == ncol(A), nA, nB, byrow=TRUE)
}

system.time(z3.0<-Marius.3.0(A,B))
#   user  system elapsed
# 0.524   0.020   0.548
system.time(z2.0<-Marius.2.0(A,B))
#   user  system elapsed
# 0.968   0.216   1.189
system.time(z1<-perhaps(A,B)) # perhaps() is not defined in this message
#   user  system elapsed
# 1.264   0.204   1.473

attr(z3.0,"dim")<-dim(z2.0)
identical(z3.0,z2.0)
#[1] TRUE
identical(z1,z3.0)
#[1] TRUE
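
The attr(z3.0, "dim") <- ... line seems to be there because cbind() over the named list returned by split() attaches column names, so identical() would otherwise fail against a plain matrix; resetting the dim attribute appears to drop them as a side effect. A sketch of a more explicit alternative with unname(), shown on the toy matrices (z3 and z4 are illustrative names, not objects from the thread):

```r
A <- matrix(c(1, 2, 3, 4, 6, 2), ncol = 2, byrow = TRUE)
B <- matrix(1:10, ncol = 2)

# Same computation as Marius.3.0 and Marius.4.0, written inline
z3 <- do.call(cbind, lapply(split(B, row(B)),
                            function(x) colSums(x >= t(A)) == ncol(A)))
z4 <- apply(B, 1, function(x) colSums(x >= t(A)) == ncol(A))

colnames(z3)               # names inherited from split(): "1" to "5"
identical(z3, z4)          # FALSE, only because of the column names
identical(unname(z3), z4)  # TRUE once the dimnames are removed
```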

A.K.



----- Original Message -----
From: Marius Hofert <marius.hofert at math.ethz.ch>
To: R-help <r-help at r-project.org>
Cc:
Sent: Saturday, December 8, 2012 6:28 AM
Subject: [R] How to efficiently compare each row in a matrix with each row in another matrix?

Dear expeRts,

I have two matrices A and B. They have the same number of columns but possibly different numbers of rows. I would like to compare each row of A with each row of B and check whether every entry in a row of A is less than or equal to the corresponding entry in a row of B. Here is a minimal working example:

A <- rbind(matrix(1:4, ncol=2, byrow=TRUE), c(6, 2)) # (3, 2) matrix
B <- matrix(1:10, ncol=2) # (5, 2) matrix
( ind <- apply(B, 1, function(b) apply(A, 1, function(a) all(a <= b))) ) # (3, 5) = (nrow(A), nrow(B)) matrix

The question is: How can this be implemented more efficiently in R, that is, in a faster way?

Thanks & cheers,

Marius

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



