[R-sig-finance] R vs. S-PLUS vs. SAS
Andrew Piskorski
atp at piskorski.com
Sat Dec 4 13:15:40 CET 2004
On Fri, Dec 03, 2004 at 06:37:15PM +0000, Patrick Burns wrote:
> There may be some differences between SAS procedures, but
> at least generally SAS does not require the whole data set to
> be in RAM. Regression will take the data row by row and
> update the answer as it goes.
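(As an aside, that row-by-row update can be sketched in R itself:
accumulate X'X and X'y one observation at a time, so only a single
row plus two small accumulators ever live in memory. This is just an
illustration of the idea, not SAS's actual algorithm; a careful
implementation would update a QR factorization instead, for numerical
stability. The next.row() data source here is hypothetical.)

  # Streaming least squares: one row in memory at a time, plus a
  # p-by-p and a p-by-1 accumulator.  Illustrative only; real
  # implementations update a QR factorization for stability.
  stream.lm <- function(next.row, p) {
    XtX <- matrix(0, p, p)
    Xty <- numeric(p)
    repeat {
      r <- next.row()             # list(x = length-p vector, y = scalar),
      if (is.null(r)) break       #   or NULL when the data are exhausted
      XtX <- XtX + outer(r$x, r$x)
      Xty <- Xty + r$x * r$y
    }
    solve(XtX, Xty)               # the usual normal-equations solution
  }

One would feed next.row() from a file connection, a database cursor,
or whatever other on-disk source is at hand.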
Someone might want to ask Joe Conway about his experience and thoughts
integrating R as a procedural language inside PostgreSQL, to create
PL/R:
http://www.joeconway.com/plr/
http://gborg.postgresql.org/project/plr/projdisplay.php
(Hm, for good measure, I have Cc'd him on this email.) Obviously, an
RDBMS like PostgreSQL is expert at dealing with data that doesn't fit
into RAM. I've no idea whether PL/R does anything special to take
advantage of that, or how feasible it would be to do so.
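(For anyone who hasn't seen PL/R: you define an ordinary PostgreSQL
function whose body is R, along the lines of the toy example in the
PL/R docs below; arguments arrive in R as arg1, arg2, and so on. The
quotes table in the SELECT is made up for illustration.)

  CREATE OR REPLACE FUNCTION r_max (integer, integer) RETURNS integer AS '
      if (arg1 > arg2)
          return(arg1)
      else
          return(arg2)
  ' LANGUAGE 'plr' STRICT;

  SELECT r_max(price_bid, price_ask) FROM quotes;  -- hypothetical table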
Does anyone here know what makes R dependent on all data being in
RAM, or have pointers to discussions of it? Is it just some
centralized low-level bits, or do broad swaths of code and algorithms
all depend on the in-RAM assumption?
How do SAS and other such systems avoid that? Do they do this better
or much more transparently than what an R user would do now
manually? Where by "manually" I mean: query some fits-in-RAM amount
of data out of an RDBMS (or other such on-disk store), analyze it,
delete the data to free up RAM, and repeat.
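(For concreteness, that manual loop via R's DBI interface might look
roughly like this; the database file, table, query, and per-chunk
updateStats() function are all placeholders:)

  library(DBI)
  library(RSQLite)                  # any DBI driver would do

  con <- dbConnect(dbDriver("SQLite"), dbname = "ticks.db")  # hypothetical db
  res <- dbSendQuery(con, "SELECT x, y FROM ticks")          # hypothetical table

  stats <- NULL
  while (!dbHasCompleted(res)) {
    chunk <- fetch(res, n = 100000)     # one fits-in-RAM piece at a time
    stats <- updateStats(stats, chunk)  # placeholder per-chunk analysis
    rm(chunk)                           # free the RAM before the next piece
  }
  dbClearResult(res)
  dbDisconnect(con)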
Could one, say, tie a lightweight, high-performance RDBMS library
like SQLite into R, and have R use it profitably to scale nicely on
data that does not fit in RAM? In what way, if any, would that offer
a substantial advantage over current manual R-plus-RDBMS practice?
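(One advantage that is already available, of course, is pushing the
reduction into the database, so that only a small result ever crosses
into R; again, the table and column names below are made up for
illustration:)

  # Let SQLite aggregate the out-of-RAM data and hand R only the
  # small answer.
  library(DBI)
  library(RSQLite)
  con <- dbConnect(dbDriver("SQLite"), dbname = "ticks.db")
  daily <- dbGetQuery(con,
      "SELECT day, COUNT(*) AS n, AVG(ret) AS mean_ret
         FROM returns GROUP BY day")
  dbDisconnect(con)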
--
Andrew Piskorski <atp at piskorski.com>
http://www.piskorski.com/