[R] Working with massive matrices in R

svrieze vrie0006 at umn.edu
Mon Apr 18 22:10:19 CEST 2011


Hello,

I'm (eventually) attempting a singular value decomposition of a 3200 x
527829 matrix in R version 2.10.1.  The script is as follows:
###---------Begin Script here-------###
library(Matrix)

snps <- 527829                   ## Number of SNPs
N <- 3200                        ## Sample size
y <- rnorm(N, 100,1)               ## simulated phenotype
## read in the 3200 x 527829 matrix as one long numeric vector
system.time(
  x <- scan("gedi7.raw", what=rep(0, snps), nmax=N*snps, skip=1)
)
## reshape into an N x snps matrix, filling by row to match the file layout
system.time(x <- matrix(x, nrow=N, ncol=snps, byrow=TRUE))
print(object.size(x), units="Mb")
###--------End Script----------------####

The scan function finishes without a problem.  "x" is stored in double
precision floating point format and, according to object.size(), takes up
12886.5 Mb of memory.
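
For what it's worth, that figure matches my back-of-the-envelope arithmetic
for N x snps doubles at 8 bytes each (my own check, not part of the script
above):

N    <- 3200
snps <- 527829
N * snps * 8 / 1024^2   ## ~12886.5, in line with the object.size() figure
N * snps * 8 / 1024^3   ## ~12.6, the size of the vector R later fails to allocate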

When I convert it to a matrix, I get an error stating that R cannot allocate
a vector of size 12.6 Gb.  I have requested 31 Gb of memory on the server,
and 12.6 Gb + 12.8 Gb = 25.4 Gb of used memory.  Is R using considerable
memory for operations not directly related to storing the matrix objects
here?  Or is this perhaps a problem of contiguous memory?
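
My working guess is that matrix() always allocates a fresh copy of its
input, so during the conversion both the 12.6 Gb scanned vector and the new
12.6 Gb matrix have to exist at the same time.  The sketch below (my own,
not something I have run on the full data) sets the dim attribute instead,
which may reuse the existing storage depending on R's copy-on-modify
bookkeeping; because scan() returns the file values in row order, the
result comes out as the transpose of the intended N x snps matrix:

## x is the length-(N*snps) vector returned by scan(), in row (file) order.
## matrix(x, nrow=N, byrow=TRUE) copies all N*snps doubles into new storage;
## assigning dims may avoid that copy, but R then reads the vector in
## column-major order, so the object becomes snps x N, i.e. t() of the
## matrix built with byrow=TRUE.
dim(x) <- c(snps, N)    ## column i now holds the SNP values for individual i

For the SVD the transposed layout may be usable as-is, since the singular
values of t(X) equal those of X (with the roles of U and V swapped).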

Any help is greatly appreciated.

-Scott


