[R] Memory problem?

Jay Emerson jayemerson at gmail.com
Thu Jan 31 14:31:07 CET 2008


Page 23 of the R Installation Guide provides some memory guidelines
that you might find helpful.

There are a few things you could try using R, at least to get up and running:

- Look at fewer tumors at a time using standard R as you have been.
- Look at the ff package, which leaves the data in flat files with
memory-mapped pages.
- It may be that package filehash does something similar using a
database (I'm less familiar with this).
- Wait for the upcoming bigmemoRy package, which is designed
to place large objects like this in RAM (using C++) but gives you a
close-to-seamless interaction with them from R.  Caveat below.
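
As a rough illustration of the first option (looking at fewer tumors at
a time), here is a minimal base-R sketch that reads a flat file in
blocks of rows and keeps only running totals in memory.  The file, the
toy dimensions (80 tumors by 50 genes) and the block size are all
made up for the example; it is not a recipe for the Cox model itself.

```r
# Toy data written to a temporary flat file (hypothetical stand-in for
# the real expression data; one row per tumor, one column per gene).
set.seed(1)
expr <- matrix(rnorm(80 * 50), nrow = 80)
tmp <- tempfile(fileext = ".csv")
write.table(expr, tmp, sep = ",", row.names = FALSE, col.names = FALSE)

# Read the file 20 rows at a time through an open connection, so only
# one block is held in memory at any moment.
con <- file(tmp, open = "r")
block_rows <- 20          # assumed block size, tune to available memory
col_sums <- numeric(50)   # running per-gene totals
n_read <- 0
repeat {
  lines <- readLines(con, n = block_rows)
  if (length(lines) == 0) break
  block <- as.matrix(read.csv(text = lines, header = FALSE))
  col_sums <- col_sums + colSums(block)
  n_read <- n_read + nrow(block)
}
close(con)

# Per-gene means computed without ever loading the whole matrix.
gene_means <- col_sums / n_read
```

This only works for summaries that can be built up block by block,
which is exactly the restriction discussed below.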

With any of these options, you are still very much restricted by the
type of analysis you are attempting.  Almost any existing procedure
(e.g. a Cox model) would need a regular R object (probably impossible)
and you are back to square one.  An exception to this is Thomas
Lumley's biglm package, which processes the data in chunks.  We need
more tools like these.  Ultimately, you'll need to find some method of
analysis that is pretty smart memory-wise, and this may not be easy.
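
To make the idea of processing data in chunks concrete, here is a
small base-R sketch for ordinary least squares.  It is not biglm's
code (as far as I know biglm updates a QR decomposition rather than
accumulating the cross-products used here), and the data, variable
names and chunk layout are invented for the example.  The point is
only that the fit needs one chunk in memory at a time.

```r
# Simulated data (hypothetical): y depends linearly on x1 and x2.
set.seed(1)
n <- 1000
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - 0.5 * dat$x2 + rnorm(n)

# Pretend the data arrive as 10 chunks of 100 rows each.
chunk_id <- rep(1:10, each = n / 10)

# Accumulate X'X and X'y over the chunks; these are small (3 x 3 and
# 3 x 1) no matter how many rows are processed.
xtx <- matrix(0, 3, 3)
xty <- matrix(0, 3, 1)
for (k in unique(chunk_id)) {
  chunk <- dat[chunk_id == k, ]
  X <- model.matrix(~ x1 + x2, data = chunk)
  xtx <- xtx + crossprod(X)
  xty <- xty + crossprod(X, chunk$y)
}

# Solve the normal equations once all chunks have been seen.
beta_chunked <- drop(solve(xtx, xty))

# For comparison only: the usual fit on the full data set.
beta_full <- coef(lm(y ~ x1 + x2, data = dat))
```

The two sets of coefficients should agree up to rounding.  A Cox model
does not break down this neatly, which is why this may not be easy.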

Best of luck,


Original message:

I am trying to run a cox model for the prediction of relapse of 80 cancer
tumors, taking into account the expression of 17000 genes. The data are
large and I retrieve an error:
"Cannot allocate vector of 2.4 Mb". I increase the memory.limit to 4000
(which is the largest supported by my computer) but I still retrieve the
error because of other big variables that I have in the workspace. Does
anyone know how to overcome this problem?

Many thanks in advance,

John W. Emerson (Jay)
Assistant Professor of Statistics
Director of Graduate Studies
Department of Statistics
Yale University
Statistical Consultant, REvolution Computing
