[R] Pearson Correlation Speed

Nathan S. Watson-Haigh nathan.watson-haigh at csiro.au
Mon Dec 15 07:02:56 CET 2008

Nathan S. Watson-Haigh wrote:
> I'm trying to calculate Pearson correlation coefficients for a large
> matrix of size 18563 x 18563. The following function takes about XX
> minutes to complete, and I'd like to do this calculation about 15 times
> and so speed is somewhat of an issue.

Sorry, meant to fill in the blanks!

The following takes about 15 minutes to complete:
corr <- abs(cor(dat, use="p"))
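For reference, each entry of abs(cor(dat, use="p")) is the absolute Pearson coefficient of one pair of columns, computed over only those rows where both values are present. A minimal C++ sketch of that per-pair calculation is below. The function name abs_pearson_pairwise is hypothetical (it is not part of R or ALGLIB), and the sketch assumes missing values arrive as NaN, which is how R's NA_real_ appears to C++ code.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical helper, for illustration only (not an R or ALGLIB function).
// Returns |r| for columns x and y of length n, skipping any row in which
// either value is NaN (the "pairwise complete" rule).
double abs_pearson_pairwise(const double* x, const double* y, std::size_t n) {
    const double nan = std::numeric_limits<double>::quiet_NaN();

    // First pass: means over the rows that are complete for this pair.
    double sx = 0.0, sy = 0.0;
    std::size_t m = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (std::isnan(x[i]) || std::isnan(y[i])) continue;
        sx += x[i];
        sy += y[i];
        ++m;
    }
    if (m < 2) return nan;  // too few complete rows to define r
    const double mx = sx / static_cast<double>(m);
    const double my = sy / static_cast<double>(m);

    // Second pass: centred sums of squares and cross-products.
    double sxx = 0.0, syy = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        if (std::isnan(x[i]) || std::isnan(y[i])) continue;
        const double dx = x[i] - mx;
        const double dy = y[i] - my;
        sxx += dx * dx;
        syy += dy * dy;
        sxy += dx * dy;
    }
    if (sxx == 0.0 || syy == 0.0) return nan;  // constant column: r undefined
    return std::fabs(sxy / std::sqrt(sxx * syy));
}
```

With 18563 columns there are roughly 172 million such pairs, and the missing-value check has to be repeated for every pair, so the cost grows with the square of the number of columns.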

>
> Does anyone have any suggestions on ways to speed this up? I'd wondered
> if using C++ code to do the calculations might speed things up, but I've
> never written any C/C++ code or attempted to use any within R.
>
> I've seen some C++ code here:
> http://www.alglib.net/statistics/correlation.php
>
> I wondered if anyone might be able to help me get this so it can run in
> R? I've tried the following:
> 1) Downloaded http://www.alglib.net/translator/dl/statistics.correlation.cpp.zip
> 2) moved the contents of the libs dir into the parent dir alongside
> correlation.cpp (didn't know how to tell R where to look for C libraries)
> 3) Tried: "R CMD SHLIB correlation.cpp" and got the following as output:
> -- start output --
> icpc -I/tools/R/2.7.1/lib/R/include  -I/usr/local/include  -mp  -fpic
> -g -O2 -c correlation.cpp -o correlation.o
> ap.h(163): warning #858: type qualifier on return type is meaningless
>   const bool operator==(const complex& lhs, const complex& rhs);
>              ^
>
> ap.h(164): warning #858: type qualifier on return type is meaningless
>   const bool operator!=(const complex& lhs, const complex& rhs);
>              ^
>
> ap.h(179): warning #858: type qualifier on return type is meaningless
>   const double abscomplex(const complex &z);
>                ^
>
> icpc -shared -L/usr/local/lib -o correlation.so correlation.o
> -- end output --
> 4) Now this doesn't look brilliant! Any thoughts? Also, I'm assuming I
> need to do some other work with the C++ code in order to allow me to use
> it from within my R scripts - any pointers on that?
>
> Thanks for any input - I hope I just need a hand over the initial
> hurdles and then I can get onto that up-hill learning curve!!
>
> Nathan
>
>
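On the question of what extra work C++ code needs before R can call it: one route is R's .C() interface, which passes every argument as a pointer and expects a function with C linkage. The following is only a sketch under stated assumptions, not the ALGLIB code. The function name abs_cor_matrix and its argument order are hypothetical, and it assumes the data contain no missing values, so it does not reproduce use="p".

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical entry point for R's .C() interface (name and signature are
// illustrative). extern "C" stops C++ name mangling so that R can find the
// symbol. R matrices are stored column-major, so column j starts at
// dat + j * nrow.
// ASSUMPTION: no NA/NaN values in dat.
extern "C" void abs_cor_matrix(double* dat, int* nrow, int* ncol, double* out) {
    const std::size_t n = static_cast<std::size_t>(*nrow);
    const std::size_t p = static_cast<std::size_t>(*ncol);

    // Standardise each column once (centre, then scale to unit length), so
    // that the correlation of two columns is just their dot product.
    std::vector<double> z(n * p);
    for (std::size_t j = 0; j < p; ++j) {
        const double* col = dat + j * n;
        double mean = 0.0;
        for (std::size_t i = 0; i < n; ++i) mean += col[i];
        mean /= static_cast<double>(n);
        double ss = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            ss += (col[i] - mean) * (col[i] - mean);
        }
        // A constant column gives ss == 0, and its entries become NaN.
        const double scale = 1.0 / std::sqrt(ss);
        for (std::size_t i = 0; i < n; ++i) {
            z[j * n + i] = (col[i] - mean) * scale;
        }
    }

    // Fill the symmetric p x p result (column-major) with |r|.
    for (std::size_t j = 0; j < p; ++j) {
        for (std::size_t k = 0; k <= j; ++k) {
            double dot = 0.0;
            for (std::size_t i = 0; i < n; ++i) {
                dot += z[j * n + i] * z[k * n + i];
            }
            out[j + p * k] = out[k + p * j] = std::fabs(dot);
        }
    }
}
```

If this were saved in a file of its own and built with R CMD SHLIB, the resulting shared object could be loaded with dyn.load() and the function called as .C("abs_cor_matrix", as.double(dat), as.integer(nrow(dat)), as.integer(ncol(dat)), out = double(ncol(dat)^2)), with the out element of the returned list reshaped by matrix(). Whether this would actually be faster than cor() has not been measured here.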

