[R] crossprod vs %*% timing

Wed Oct 6 10:09:04 CEST 2004

Hi

The man page says that crossprod(x,y) is formally equivalent to, but
faster than, the call 't(x) %*% y'.
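
As a quick check of the equivalence the man page describes, here is a minimal sketch on small made-up matrices. The sizes and the names via_t, via_crossprod and same_value are only for illustration. It checks that the values agree and says nothing about speed.

```r
# Small made-up matrices; x and y follow the man page's argument names.
set.seed(1)
x <- matrix(rnorm(12), 4, 3)
y <- matrix(rnorm(8), 4, 2)

via_t         <- t(x) %*% y       # explicit call to t(), then multiply
via_crossprod <- crossprod(x, y)  # same product, no explicit call to t()

# all.equal() compares up to numerical tolerance rather than bit for bit.
same_value <- isTRUE(all.equal(via_t, via_crossprod))
print(same_value)
print(dim(via_crossprod))  # rows = ncol(x), columns = ncol(y)
```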

I have a vector 'a' and a matrix 'A' (called X in the code below), and
need to evaluate 't(a) %*% A %*% a' many, many times, so performance is
becoming crucial.  With

f1 <- function(a,X){ ignore <- t(a) %*% X %*% a               }
f2 <- function(a,X){ ignore <- crossprod(t(crossprod(a,X)),a) }
f3 <- function(a,X){ ignore <- crossprod(a,X) %*% a           }

a <- rnorm(100)
X <- matrix(rnorm(10000),100,100)

print(system.time( for(i in 1:10000){ f1(a,X)}))
print(system.time( for(i in 1:10000){ f2(a,X)}))
print(system.time( for(i in 1:10000){ f3(a,X)}))

I get something like (one row of system.time() output each for f1, f2
and f3, in that order):

[1] 2.68 0.05 2.66 0.00 0.00
[1] 0.48 0.00 0.49 0.00 0.00
[1] 0.29 0.00 0.31 0.00 0.00

with quite low variability from run to run.  What surprises me is the
third timing, for f3(): it is about 40% faster than the second one,
f2().  The extra time in f2() is possibly related to the call to t()
(and Rprof shows about 35% of total time in t() for my application).

So it looks like f3() is the winner hands down, at least for this
task.  What is a good way of thinking about such issues?  Anyone got
any performance tips?
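
Before comparing timings it may be worth confirming that the three forms return the same number. The sketch below does that on data of the same shape as above. The fourth variant, q4, is a hypothetical alternative that is not in the code above, and I make no claim about its speed.

```r
set.seed(42)
a <- rnorm(100)
X <- matrix(rnorm(10000), 100, 100)

# The three forms used in f1, f2 and f3, kept as values here.
q1 <- t(a) %*% X %*% a
q2 <- crossprod(t(crossprod(a, X)), a)
q3 <- crossprod(a, X) %*% a

# Hypothetical fourth variant (an assumption, not from the post):
# elementwise product of a with X %*% a, then sum.
q4 <- sum(a * (X %*% a))

# drop() turns the 1 x 1 matrices into plain numbers before comparing.
forms_agree <- isTRUE(all.equal(drop(q1), drop(q2))) &&
               isTRUE(all.equal(drop(q1), drop(q3))) &&
               isTRUE(all.equal(drop(q1), q4))
print(forms_agree)
```

Each variant can then be timed with the same system.time() loop as above.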

I quite often need things like 'a %*% X %*% t(Y) %*% Z %*% t(b)' which
would be something like
crossprod(t(crossprod(t(crossprod(t(crossprod(a,X)),t(Y))),Z)),t(b))
(I think).
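
One way to test that guess is to compare both expressions on small made-up inputs. This sketch assumes 'a' is a plain vector and 'b' is a one-row matrix, so that t(b) is a column. The dimensions are arbitrary. The third form assumes an R version that provides tcrossprod(), which may not include R-1.9.1.

```r
set.seed(7)
a <- rnorm(5)                  # plain vector, treated as a row by a %*% X
X <- matrix(rnorm(20), 5, 4)
Y <- matrix(rnorm(12), 3, 4)
Z <- matrix(rnorm(18), 3, 6)
b <- matrix(rnorm(6), 1, 6)    # assumption: b is a 1 x 6 row matrix

# The chain written with %*% and t() only.
plain <- a %*% X %*% t(Y) %*% Z %*% t(b)

# The nested crossprod() form from the post.
nested <- crossprod(t(crossprod(t(crossprod(t(crossprod(a, X)), t(Y))), Z)), t(b))

# Alternative sketch: tcrossprod(x, y) computes x %*% t(y),
# which removes the explicit t() calls in this particular chain.
mixed <- tcrossprod(tcrossprod(crossprod(a, X), Y) %*% Z, b)

chain_agrees <- isTRUE(all.equal(plain, nested)) &&
                isTRUE(all.equal(plain, mixed))
print(chain_agrees)
```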

(R-1.9.1, 2GHz G5 PowerPC, MacOSX10.3.5)

-- 
Robin Hankin
Uncertainty Analyst
Southampton Oceanography Centre
SO14 3ZH
tel +44(0)23-8059-7743
initialDOTsurname at soc.soton.ac.uk (edit in obvious way; spam precaution)