[R] crossprod(x) vs crossprod(x,x)

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Nov 21 12:17:04 CET 2006


On Tue, 21 Nov 2006, Robin Hankin wrote:

> I found out the other day that crossprod() will take a single matrix argument;
> crossprod(x)  notionally returns crossprod(x,x).

It actually says

     x, y: matrices: 'y = NULL' is taken to be the same matrix as 'x'.

but not that it is the same as crossprod(x,x).

> The two forms do not  return identical matrices:
>
> x <- matrix(rnorm(3000000),ncol=3)
> M1 <- crossprod(x)
> M2 <- crossprod(x,x)
>
> R> max(abs(M1-M2))
> [1] 1.932494e-08
>
> But what really surprised me is that crossprod(x) is slower than
> crossprod(x,x):
>
> R> system.time(crossprod(x))
> [1] 0.079 0.206 0.292 0.000 0.000
> R> system.time(crossprod(x,x))
> [1] 0.035 0.001 0.041 0.000 0.000

That's not usually the case: your example is too small to be reliable, and 
the large system time is suspicious.  I get (in R-devel, R's internal 
BLAS)

> system.time(crossprod(x))
    user  system elapsed
   0.034   0.000   0.034
> system.time(crossprod(x,x))
    user  system elapsed
   0.044   0.001   0.044

Such small times are subject to a lot of unrepeatability (they depend on 
when gc()s happen and hence on the current state of the gc tuning).  I 
suggest you try running repeats for a for loop of 100, e.g.

> system.time(for(i in 1:100) crossprod(x))
    user  system elapsed
   3.602   0.004   4.895
> system.time(for(i in 1:100) crossprod(x))
    user  system elapsed
   3.612   0.015   3.984
> system.time(for(i in 1:100) crossprod(x))
    user  system elapsed
   3.514   0.009   4.727
> system.time(for(i in 1:100) crossprod(x,x))
    user  system elapsed
   5.636   0.013   8.963
> system.time(for(i in 1:100) crossprod(x,x))
    user  system elapsed
   5.255   0.011  10.961

(on a heavily used dual-processor Opteron system).

For a more realistic example

> x <- matrix(rnorm(3000000),ncol=300)
> system.time(crossprod(x))
    user  system elapsed
   2.196   0.001   2.197
> system.time(crossprod(x,x))
    user  system elapsed
   4.609   0.020   8.972

or using an optimized BLAS:

> x <- matrix(rnorm(3000000),ncol=300)
> system.time(crossprod(x))
    user  system elapsed
   0.398   0.011   0.495
> system.time(crossprod(x,x))
    user  system elapsed
   0.454   0.004   0.479

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



More information about the R-help mailing list