[R] crossprod(x) vs crossprod(x,x)

Robin Hankin r.hankin at noc.soton.ac.uk
Tue Nov 21 12:39:04 CET 2006


Professor Ripley

thanks for this.

I see what you say about gc().  I'll try your looped test on my system:



 > x <- matrix(rnorm(3000000),ncol=3)
 >
 > system.time(for(i in 1:100){crossprod(x)})
[1]  7.931 20.420 28.619  0.000  0.000
 > system.time(for(i in 1:100){crossprod(x)})
[1]  7.975 20.590 29.512  0.000  0.000
 > system.time(for(i in 1:100){crossprod(x)})
[1]  8.003 20.641 30.160  0.000  0.000
 > system.time(for(i in 1:100){crossprod(x,x)})
[1] 2.350 0.032 2.552 0.000 0.000
 > system.time(for(i in 1:100){crossprod(x,x)})
[1] 2.330 0.015 2.333 0.000 0.000
 > system.time(for(i in 1:100){crossprod(x,x)})
[1] 2.333 0.034 2.447 0.000 0.000
 >



How come my findings are qualitatively different from yours?







 > Sys.info()
                                                                         
                   sysname
                                                                         
                  "Darwin"
                                                                         
                   release
                                                                         
                   "8.8.0"
                                                                         
                   version
"Darwin Kernel Version 8.8.0: Fri Sep  8 17:18:57 PDT 2006;  
root:xnu-792.12.6.obj~1/RELEASE_PPC"
                                                                         
                  nodename
                                                                         
"octopus.noc.soton.ac.uk"
                                                                         
                   machine
                                                                         
         "Power Macintosh"
                                                                         
                     login
                                                                         
                    "rksh"
                                                                         
                      user
                                                                         
                    "rksh"
 >









On 21 Nov 2006, at 11:17, Prof Brian Ripley wrote:

> On Tue, 21 Nov 2006, Robin Hankin wrote:
>
>> I found out the other day that crossprod() will take a single  
>> matrix argument;
>> crossprod(x)  notionally returns crossprod(x,x).
>
> It actually says
>
>     x, y: matrices: 'y = NULL' is taken to be the same matrix as 'x'.
>
> but not that it is the same as crossprod(x,x).
>
>> The two forms do not  return identical matrices:
>>
>> x <- matrix(rnorm(3000000),ncol=3)
>> M1 <- crossprod(x)
>> M2 <- crossprod(x,x)
>>
>> R> max(abs(M1-M2))
>> [1] 1.932494e-08
>>
>> But what really surprised me is that crossprod(x) is slower than
>> crossprod(x,x):
>>
>> R> system.time(crossprod(x))
>> [1] 0.079 0.206 0.292 0.000 0.000
>> R> system.time(crossprod(x,x))
>> [1] 0.035 0.001 0.041 0.000 0.000
>
> That's not usually the case: your example is too small to be  
> reliable, and the large system time is suspicious.  I get (in R- 
> devel, R's internal BLAS)
>
>> system.time(crossprod(x))
>    user  system elapsed
>   0.034   0.000   0.034
>> system.time(crossprod(x,x))
>    user  system elapsed
>   0.044   0.001   0.044
>
> Such small times are subject to a lot of unrepeatability (they  
> depend on when gc()s happen and hence on the current state of the  
> gc tuning).  I suggest you try running repeats for a for loop of  
> 100, e.g.
>
>> system.time(for(i in 1:100) crossprod(x))
>    user  system elapsed
>   3.602   0.004   4.895
>> system.time(for(i in 1:100) crossprod(x))
>    user  system elapsed
>   3.612   0.015   3.984
>> system.time(for(i in 1:100) crossprod(x))
>    user  system elapsed
>   3.514   0.009   4.727
>> system.time(for(i in 1:100) crossprod(x,x))
>    user  system elapsed
>   5.636   0.013   8.963
>> system.time(for(i in 1:100) crossprod(x,x))
>    user  system elapsed
>   5.255   0.011  10.961
>
> (on a heavily used dual-processor Opteron system).
>
> For a more realistic example
>
>> x <- matrix(rnorm(3000000),ncol=300)
>> system.time(crossprod(x))
>    user  system elapsed
>   2.196   0.001   2.197
>> system.time(crossprod(x,x))
>    user  system elapsed
>   4.609   0.020   8.972
>
> or using an optimized BLAS:
>
>> x <- matrix(rnorm(3000000),ncol=300)
>> system.time(crossprod(x))
>    user  system elapsed
>   0.398   0.011   0.495
>> system.time(crossprod(x,x))
>    user  system elapsed
>   0.454   0.004   0.479
>
> -- 
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595

--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743



More information about the R-help mailing list