[R] Indexing matrices from the Matrix package with [i, j] seems to be very slow. Are there "faster alternatives"?
Søren Højsgaard
sorenh at math.aau.dk
Wed Jun 27 01:19:40 CEST 2012
Dear Duncan
Thanks for your suggestion, but I really need sparse matrices: I have implemented various graph algorithms based on adjacency matrices. For large graphs, storing all the 0's in an adjacency matrices become "uneconomical", and therefore I thought I would use sparse matrices but the speed of "[i,j]" will slow down the algorithms. However, using RcppEigen it is possible to mimic "[i,j]" with a slowdown of "only" a factor 16 which is much better than what is obtained when using "[i,j]":
> benchmark(lookup(mm,`[`), lookup(MM,`[`), lookup(MM, Xiijj),
+ columns=c("test", "replications", "elapsed", "relative"), replications=5)
test replications elapsed relative
1 lookup(mm, `[`) 5 0.05 1.0
2 lookup(MM, `[`) 5 23.54 470.8
3 lookup(MM, Xiijj) 5 0.84 16.8
The code for producing the result is given below.
Best regards,
Søren
---------------------
library(inline)
library(RcppEigen)
library(rbenchmark)
library(Matrix)
src <- '
using namespace Rcpp;
typedef Eigen::SparseMatrix<double> MSpMat;
const MSpMat X(as<MSpMat>(XX_));
int i = as<int>(ii_)-1;
int j = as<int>(jj_)-1;
double ans = X.coeff(i,j);
return(wrap(ans));
'
Xiijj <- cxxfunction(signature(XX_="matrix", ii_="integer", jj_="integer"),
body=src, plugin="RcppEigen")
mm <- matrix(c(1,0,0,0,0,0,0,0), nr=100, nc=100)
MM <- as(mm, "Matrix")
object.size(mm)
object.size(MM)
lookup <- function(mat, func){
for (i in 1:nrow(mat)){
for (j in 1:ncol(mat)){
v<-func(mat,i,j)
}
}
}
benchmark(lookup(mm,`[`), lookup(MM,`[`), lookup(MM, Xiijj),
columns=c("test", "replications", "elapsed", "relative"), replications=5)
--------------------------------
-----Original Message-----
From: Duncan Murdoch [mailto:murdoch.duncan at gmail.com]
Sent: 25. juni 2012 11:27
To: Søren Højsgaard
Cc: r-help at r-project.org
Subject: Re: [R] Indexing matrices from the Matrix package with [i, j] seems to be very slow. Are there "faster alternatives"?
On 12-06-24 4:50 PM, Søren Højsgaard wrote:
> Dear all,
>
> Indexing matrices from the Matrix package with [i,j] seems to be very slow. For example:
>
> library(rbenchmark)
> library(Matrix)
> mm<- matrix(c(1,0,0,0,0,0,0,0), nr=20, nc=20)
> MM<- as(mm, "Matrix")
> lookup<- function(mat){
> for (i in 1:nrow(mat)){
> for (j in 1:ncol(mat)){
> mat[i,j]
> }
> }
> }
>
> benchmark(lookup(mm), lookup(MM), columns=c("test", "replications", "elapsed", "relative"), replications=50)
> test replications elapsed relative
> 1 lookup(mm) 50 0.01 1
> 2 lookup(MM) 50 8.77 877
>
> I would have expected a small overhead when indexing a matrix from the Matrix package, but this result is really surprising...
> Does anybody know if there are faster alternatives to [i,j] ?
There's also a large overhead when indexing a dataframe, though Matrix appears to be slower. It's designed to work on whole matrices at a time, not single entries. So I'd suggest that if you need to use [i,j] indexing, then try to arrange your code to localize the access, and extract a submatrix as a regular fast matrix first. (Or if it will fit in memory, convert the whole thing to a matrix just for the access. If I just add the line
mat <- as.matrix(mat)
at the start of your lookup function, it becomes several hundred times
faster.)
More information about the R-help
mailing list