[R] Correlation of huge matrix saved as binary file

Thomas Lumley tlumley at uw.edu
Sat Mar 3 23:12:09 CET 2012


On Sat, Mar 3, 2012 at 2:36 PM, Peter Langfelder
<peter.langfelder at gmail.com> wrote:

> 3. Instead of calculating the correlations one-by-one, calculate them
> in small blocks (if you have enough memory and you run a 64-bit R).
> With 900M rows, you will only be able to put a 900Mx2 into an R
> object, but if you have two such standardized matrices loaded in g1,
> g2, you can get their (2x2) correlation matrix by t(g1) %*% g2. This
> 2x2 matrix you can use to fill the appropriate components of the
> result matrix.

Or split it the other way. Compute the covariance of all 9000
variables on, say, 50k observations and store it. Repeat 180 times,
then add up the covariances and scale to a correlation.
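
A minimal sketch of that chunked approach, under some assumptions that are
not in the original message: the file holds doubles stored row by row (one
observation per row), and the sizes, file path and variable names below are
hypothetical stand-ins for the real data. One caveat: per-chunk covariances
only add up to the overall covariance if every chunk is centered on the same
overall means, so this sketch accumulates column sums and raw cross-products
instead and centers once at the end.

```r
# Hypothetical small example standing in for the real data:
# n observations (rows) of p variables, written row by row as doubles.
set.seed(1)
n <- 1000; p <- 5; chunk_rows <- 128
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
x[, 2] <- x[, 1] + 0.5 * x[, 2]   # induce some correlation

path <- tempfile(fileext = ".bin")
writeBin(as.vector(t(x)), path)    # t() so each observation is contiguous on disk

con <- file(path, open = "rb")
n_seen   <- 0
col_sums <- numeric(p)             # running column sums
cross    <- matrix(0, p, p)        # running sum of t(chunk) %*% chunk
repeat {
  vals <- readBin(con, what = "double", n = chunk_rows * p)
  if (length(vals) == 0) break     # end of file
  chunk    <- matrix(vals, ncol = p, byrow = TRUE)
  n_seen   <- n_seen + nrow(chunk)
  col_sums <- col_sums + colSums(chunk)
  cross    <- cross + crossprod(chunk)
}
close(con)
unlink(path)

# Center once using the overall means, then scale to a correlation.
means   <- col_sums / n_seen
cov_all <- (cross - n_seen * tcrossprod(means)) / (n_seen - 1)
cor_all <- cov2cor(cov_all)
```

Only one chunk and a p x p accumulator are in memory at any time. If the
column means are large relative to the standard deviations, subtracting the
uncentered cross-products can lose precision; a first pass to get the means,
followed by a second pass on centered chunks, would avoid that.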

    -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland


