[R] R does in memory analysis only?

David Smith dsmith at insightful.com
Mon Feb 9 19:32:05 CET 2004


Ross Boylan writes:
> R works only on problems that fit into (real or virtual) memory.
> ... does S-Plus have the same limitation?

S-PLUS, like R, does its computations in-memory. So you're limited to solving
problems which can fit in the available RAM (plus available swap space).  The
OS may impose additional limits (e.g. 2GB of total address space on many
Windows systems).
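
As a rough back-of-the-envelope illustration of what a 2GB address space
means for a numeric data set, the sketch below assumes every value is stored
as an 8-byte double and ignores interpreter overhead and the intermediate
copies an analysis makes, so it is an upper bound rather than a practical
figure. The function name and the 100-column example are hypothetical.

```python
BYTES_PER_DOUBLE = 8            # assumption: all values stored as 8-byte doubles
ADDRESS_SPACE = 2 * 1024 ** 3   # the 2GB per-process limit, in bytes


def max_rows(n_columns, address_space=ADDRESS_SPACE):
    """Upper bound on rows of an all-numeric table that fit in address_space."""
    return address_space // (n_columns * BYTES_PER_DOUBLE)


rows_100_cols = max_rows(100)   # bound for a hypothetical 100-column data set
print(rows_100_cols)
```

In practice the usable size is smaller, because fitting a model typically
needs working copies of the data in addition to the data itself.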

However, Insightful Miner, which works with S-PLUS, does include algorithms
which can process data sets out of memory. This includes the ability to
perform regressions on data sets much larger than the available RAM (the only
limit is the availability of disk space to store the results).  You can also
link S-PLUS with Insightful Miner to perform out-of-memory calculations using
S-PLUS functions.  This works especially well with operations like predicting
from a model, which can be performed on a row-by-row basis.
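
To make the row-by-row idea concrete, here is a minimal sketch of a one-pass
simple regression followed by row-by-row prediction. It is not the algorithm
used by Insightful Miner or S-PLUS; it only illustrates that a least-squares
fit of y = a + b*x needs five running sums rather than the whole data set in
memory. The function names, the x,y column layout, and the tiny in-line data
set are all assumptions made for the example.

```python
import csv
import io


def read_pairs(handle):
    """Yield (x, y) floats one row at a time from CSV text with x,y columns."""
    for record in csv.DictReader(handle):
        yield float(record["x"]), float(record["y"])


def fit_simple_regression(rows):
    """Fit y = a + b*x in a single pass over an iterable of (x, y) pairs.

    Only a row count and four running sums are kept, so memory use does
    not grow with the number of rows.
    """
    n = 0
    sx = sy = sxx = sxy = 0.0
    for x, y in rows:
        n += 1
        sx += x
        sy += y
        sxx += x * x
        sxy += x * y
    denom = n * sxx - sx * sx
    if n < 2 or denom == 0:
        raise ValueError("need at least two rows with distinct x values")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return intercept, slope


def predict_rows(xs, intercept, slope):
    """Yield one prediction per input value, never holding all rows at once."""
    for x in xs:
        yield intercept + slope * x


# Stand-in for a large file on disk; a real file handle would work the same way.
data = io.StringIO("x,y\n0,1\n1,3\n2,5\n3,7\n")
intercept, slope = fit_simple_regression(read_pairs(data))
predictions = list(predict_rows([4.0, 5.0], intercept, slope))
print(intercept, slope, predictions)
```

Plain running sums can lose precision on very long or badly scaled data, so
a production implementation would likely use a numerically more careful
update; the point here is only that the fit and the predictions each need a
single pass over the rows.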

I wrote a long discussion about in-memory and out-of-memory algorithms in the
context of S-PLUS and Insightful Miner, which you can download from:

http://www.insightful.com/support/whitepaper_download.asp

# David Smith

-- 
David M Smith <dsmith at insightful.com>
Product Manager, Insightful Corp, Seattle WA
Tel: +1 (206) 802 2360
Fax: +1 (206) 283 6310

New Insightful Miner 3! Discover how Pfizer, Bank of America and others are
using Insightful Miner -- a highly scalable data analysis workbench. Learn
more at http://www.insightful.com/products/iminer

> -----Original Message-----
> From: Ross Boylan [mailto:ross at biostat.ucsf.edu]
> Sent: Saturday, February 07, 2004 2:16 PM
> To: r-help
> Subject: [R] R does in memory analysis only?
> 
> 
> I wonder if someone would confirm something I'm 99% sure of from the
> docs and discussion on the list, but can't find stated explicitly:
> R works only on problems that fit into (real or virtual) memory.
> 
> Thus, even if you have a problem (e.g., simple regression) that could be
> solved by doing some operation on each row of a dataset at a time, you
> can't solve it unless the entire dataset and associated intermediate
> results fit in memory.
> 
> So if you're in 32 bits, your max problem size is about 2G (regular
> Windows and Linux limit your process size to this, though I think some
> fancy versions let you go a bit higher).
> 
> Is there any thought of relaxing this limitation?  I realize doing so
> would be a big job.  I also realize that 64 bits makes it much less
> pressing.
> 
> Finally, does S-Plus have the same limitation?
> 
> Thanks.
> -- 
> Ross Boylan                                      wk:  (415) 502-4031
> 530 Parnassus Avenue (Library) rm 115-4          ross at biostat.ucsf.edu
> Dept of Epidemiology and Biostatistics           fax: (415) 476-9856
> University of California, San Francisco
> San Francisco, CA 94143-0840                     hm:  (415) 550-1062
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



