[R] Indexing matrices from the Matrix package with [i, j] seems to be very slow. Are there "faster alternatives"?

Søren Højsgaard sorenh at math.aau.dk
Wed Jun 27 01:44:32 CEST 2012


Duncan,
I should probably add that I am aware that my code is not the solution and also that the relative gain of my code probably decreases with the problem size until eventually it will perform worse that [i,j] (because of copying I suppose). So my point is just:  It would just be nice if [i,j] was faster...
Regards
Søren

PS: For a 2000 x 2000 matrix I get:
                  test replications elapsed  relative
1   lookup(mm, `[`)              5    14.85 1.000000
2 lookup(MM, Xiijj)            5  133.66 9.000673

Using the modified code

src <- '
using namespace Rcpp;
typedef Eigen::MappedSparseMatrix<double> MSpMat;
const MSpMat X(as<MSpMat>(XX_));
int i = as<int>(ii_)-1;
int j = as<int>(jj_)-1;
double ans = X.coeff(i,j);
return(wrap(ans));
'





-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Søren Højsgaard
Sent: 27. juni 2012 01:20
To: Duncan Murdoch
Cc: r-help at r-project.org
Subject: Re: [R] Indexing matrices from the Matrix package with [i, j] seems to be very slow. Are there "faster alternatives"?

Dear Duncan

Thanks for your suggestion, but I really need sparse matrices: I have implemented various graph algorithms based on adjacency matrices. For large graphs, storing all the 0's in an adjacency matrices become "uneconomical", and therefore I thought I would use sparse matrices but the speed of "[i,j]" will slow down the algorithms. However, using RcppEigen it is possible to mimic "[i,j]" with a slowdown of "only" a factor 16 which is much better than what is obtained when using "[i,j]":

> benchmark(lookup(mm,`[`), lookup(MM,`[`), lookup(MM, Xiijj),
+ columns=c("test", "replications", "elapsed", "relative"), 
+ replications=5)
               test replications elapsed relative
1   lookup(mm, `[`)            5    0.05      1.0
2   lookup(MM, `[`)            5   23.54    470.8
3 lookup(MM, Xiijj)            5    0.84     16.8

The code for producing the result is given below.

Best regards,
Søren

---------------------

library(inline)
library(RcppEigen)
library(rbenchmark)
library(Matrix)

src <- '
using namespace Rcpp;
typedef Eigen::SparseMatrix<double> MSpMat; const MSpMat X(as<MSpMat>(XX_)); int i = as<int>(ii_)-1; int j = as<int>(jj_)-1; double ans = X.coeff(i,j); return(wrap(ans)); '

Xiijj <- cxxfunction(signature(XX_="matrix", ii_="integer", jj_="integer"), 
	body=src, plugin="RcppEigen")
	
mm <- matrix(c(1,0,0,0,0,0,0,0), nr=100, nc=100) MM <- as(mm, "Matrix")
object.size(mm)
object.size(MM)

lookup <- function(mat, func){
  for (i in 1:nrow(mat)){
	for (j in 1:ncol(mat)){
		v<-func(mat,i,j)
	}
   }
}

benchmark(lookup(mm,`[`), lookup(MM,`[`), lookup(MM, Xiijj),
	columns=c("test", "replications", "elapsed", "relative"), replications=5)

--------------------------------









-----Original Message-----
From: Duncan Murdoch [mailto:murdoch.duncan at gmail.com]
Sent: 25. juni 2012 11:27
To: Søren Højsgaard
Cc: r-help at r-project.org
Subject: Re: [R] Indexing matrices from the Matrix package with [i, j] seems to be very slow. Are there "faster alternatives"?

On 12-06-24 4:50 PM, Søren Højsgaard wrote:
> Dear all,
>
> Indexing matrices from the Matrix package with [i,j] seems to be very slow. For example:
>
>   library(rbenchmark)
>   library(Matrix)
>   mm<- matrix(c(1,0,0,0,0,0,0,0), nr=20, nc=20)
>   MM<- as(mm, "Matrix")
>   lookup<- function(mat){
>     for (i in 1:nrow(mat)){
>       for (j in 1:ncol(mat)){
>          mat[i,j]
>       }
>     }
> }
>
>   benchmark(lookup(mm), lookup(MM),  columns=c("test", "replications", "elapsed", "relative"), replications=50)
>         test     replications elapsed relative
> 1 lookup(mm)           50    0.01           1
> 2 lookup(MM)           50    8.77      877
>
> I would have expected a small overhead when indexing a matrix from the Matrix package, but this result is really surprising...
> Does anybody know if there are faster alternatives to [i,j] ?

There's also a large overhead when indexing a dataframe, though Matrix appears to be slower.  It's designed to work on whole matrices at a time, not single entries.  So I'd suggest that if you need to use [i,j] indexing, then try to arrange your code to localize the access, and extract a submatrix as a regular fast matrix first. (Or if it will fit in memory, convert the whole thing to a matrix just for the access.  If I just add the line

mat <- as.matrix(mat)

at the start of your lookup function, it becomes several hundred times
faster.)

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list