[R] Memory problem on a linux cluster using a large data set [Broadcast]
Martin Morgan
mtmorgan at fhcrc.org
Thu Dec 21 18:07:01 CET 2006
Section 8 of the Installation and Administration guide says that on
64-bit architectures the 'size of a block of memory allocated is
limited to 2^32-1 (8 GB) bytes'.
The wording 'a block of memory' here is important, because this sets a
limit on a single allocation rather than the memory consumed by an R
session. The original poster's allocation was something like
300,000 SNPs x 1000 individuals x 8 bytes (depending on
representation, I guess) = about 2.4 GB (roughly 2.2 GiB), so there
is still some room for even larger data.
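
As a rough check of that arithmetic, here is a small R sketch. The
variable names are only illustrative, and the 4-byte and 1-byte
columns are just alternative representations one might consider
(R stores doubles in 8 bytes, integers in 4 and raw values in 1).
The last line compares the element count with .Machine$integer.max,
on the assumption that a single vector is indexed by a 32-bit
integer.

```r
# Approximate size of a SNP x individual matrix held as one R vector.
n_snps <- 300000
n_individuals <- 1000
n_elements <- n_snps * n_individuals   # stored as double, so no integer overflow

# Bytes per element for three possible storage modes.
bytes_per_element <- c(double = 8, integer = 4, raw = 1)
size_bytes <- n_elements * bytes_per_element

# Report both decimal gigabytes and binary gibibytes.
size_table <- rbind(GB = size_bytes / 1e9, GiB = size_bytes / 2^30)
print(round(size_table, 2))

# Assumption: one vector can hold at most .Machine$integer.max elements.
fits_in_one_vector <- n_elements <= .Machine$integer.max
print(fits_in_one_vector)
```

The point is only that the matrix is a single allocation of a few
GB, so it is the per-allocation limit, not the total session memory,
that matters here.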
Obviously it's important to think carefully about how the statistical
analysis of such a large volume of data will proceed, and how the
results will be interpreted.
Martin
Thomas Lumley <tlumley at u.washington.edu> writes:
> On Thu, 21 Dec 2006, Iris Kolder wrote:
>
>> Thank you all for your help!
>>
>> So with all your suggestions we will try to run it on a computer with a
>> 64-bit processor. But I've been told that the new R versions all work
>> on a 32-bit processor. I read in other posts that only the old R
>> versions were capable of larger data sets and were running under 64-bit
>> processors. I also read that they are adapting the new R version for
>> 64-bit processors again, so does anyone know if there is a version
>> available that we could use?
>
> Huh? R 2.4.x runs perfectly happily accessing large memory under Linux on
> 64bit processors (and Solaris, and probably others). I think it even works
> on Mac OS X now.
>
> For example:
> > x<-rnorm(1e9)
> > gc()
>              used   (Mb) gc trigger   (Mb)   max used   (Mb)
> Ncells     222881   12.0     467875   25.0     350000   18.7
> Vcells 1000115046 7630.3 1000475743 7633.1 1000115558 7630.3
>
>
> -thomas
>
> Thomas Lumley Assoc. Professor, Biostatistics
> tlumley at u.washington.edu University of Washington, Seattle
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Martin T. Morgan
Bioconductor / Computational Biology
http://bioconductor.org