[R] crossprod(x) vs crossprod(x,x)
Robin Hankin
r.hankin at noc.soton.ac.uk
Tue Nov 21 12:39:04 CET 2006
Professor Ripley
thanks for this.
I see what you say about gc(). I'll try your looped test on my system:
> x <- matrix(rnorm(3000000),ncol=3)
>
> system.time(for(i in 1:100){crossprod(x)})
[1] 7.931 20.420 28.619 0.000 0.000
> system.time(for(i in 1:100){crossprod(x)})
[1] 7.975 20.590 29.512 0.000 0.000
> system.time(for(i in 1:100){crossprod(x)})
[1] 8.003 20.641 30.160 0.000 0.000
> system.time(for(i in 1:100){crossprod(x,x)})
[1] 2.350 0.032 2.552 0.000 0.000
> system.time(for(i in 1:100){crossprod(x,x)})
[1] 2.330 0.015 2.333 0.000 0.000
> system.time(for(i in 1:100){crossprod(x,x)})
[1] 2.333 0.034 2.447 0.000 0.000
>
How come my findings are qualitatively different from yours?
> Sys.info()
sysname
"Darwin"
release
"8.8.0"
version
"Darwin Kernel Version 8.8.0: Fri Sep 8 17:18:57 PDT 2006;
root:xnu-792.12.6.obj~1/RELEASE_PPC"
nodename
"octopus.noc.soton.ac.uk"
machine
"Power Macintosh"
login
"rksh"
user
"rksh"
>
On 21 Nov 2006, at 11:17, Prof Brian Ripley wrote:
> On Tue, 21 Nov 2006, Robin Hankin wrote:
>
>> I found out the other day that crossprod() will take a single
>> matrix argument;
>> crossprod(x) notionally returns crossprod(x,x).
>
> It actually says
>
> x, y: matrices: 'y = NULL' is taken to be the same matrix as 'x'.
>
> but not that it is the same as crossprod(x,x).
>
>> The two forms do not return identical matrices:
>>
>> x <- matrix(rnorm(3000000),ncol=3)
>> M1 <- crossprod(x)
>> M2 <- crossprod(x,x)
>>
>> R> max(abs(M1-M2))
>> [1] 1.932494e-08
>>
>> But what really surprised me is that crossprod(x) is slower than
>> crossprod(x,x):
>>
>> R> system.time(crossprod(x))
>> [1] 0.079 0.206 0.292 0.000 0.000
>> R> system.time(crossprod(x,x))
>> [1] 0.035 0.001 0.041 0.000 0.000
>
> That's not usually the case: your example is too small to be
> reliable, and the large system time is suspicious. I get (in R-
> devel, R's internal BLAS)
>
>> system.time(crossprod(x))
> user system elapsed
> 0.034 0.000 0.034
>> system.time(crossprod(x,x))
> user system elapsed
> 0.044 0.001 0.044
>
> Such small times are subject to a lot of unrepeatability (they
> depend on when gc()s happen and hence on the current state of the
> gc tuning). I suggest you try running repeats for a for loop of
> 100, e.g.
>
>> system.time(for(i in 1:100) crossprod(x))
> user system elapsed
> 3.602 0.004 4.895
>> system.time(for(i in 1:100) crossprod(x))
> user system elapsed
> 3.612 0.015 3.984
>> system.time(for(i in 1:100) crossprod(x))
> user system elapsed
> 3.514 0.009 4.727
>> system.time(for(i in 1:100) crossprod(x,x))
> user system elapsed
> 5.636 0.013 8.963
>> system.time(for(i in 1:100) crossprod(x,x))
> user system elapsed
> 5.255 0.011 10.961
>
> (on a heavily used dual-processor Opteron system).
>
> For a more realistic example
>
>> x <- matrix(rnorm(3000000),ncol=300)
>> system.time(crossprod(x))
> user system elapsed
> 2.196 0.001 2.197
>> system.time(crossprod(x,x))
> user system elapsed
> 4.609 0.020 8.972
>
> or using an optimized BLAS:
>
>> x <- matrix(rnorm(3000000),ncol=300)
>> system.time(crossprod(x))
> user system elapsed
> 0.398 0.011 0.495
>> system.time(crossprod(x,x))
> user system elapsed
> 0.454 0.004 0.479
>
> --
> Brian D. Ripley, ripley at stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
tel 023-8059-7743
More information about the R-help
mailing list