[R-sig-DB] Is any database particularly better at "exchanging" large datasets with R?

Sean Davis
Wed Feb 6 22:13:47 CET 2008


On Feb 6, 2008 2:10 PM, Thomas Pujol <thomas.pujol using yahoo.com> wrote:
> Is any database particularly better at "exchanging" data with R?
>
> Background:
> Sometime during the next 12 months, I plan to configure a new computer system on which I will primarily run R and a SQL database (Microsoft SQL Server, MySQL, Oracle, etc.). My primary goal is to optimize the system for R, and for passing data between R and the database.
>
> I work with large datasets, and therefore I "think" one of my most important goals should be to maximize the amount of RAM that R can utilize effectively.
>
> I am seeking advice concerning the database, version of R, OS, processor, hard-drive/storage configuration, etc. that I should consider. (I am guessing that I should build a system with lots of RAM and a Linux OS, but am seeking advice from the R community.) If I choose Linux, does it matter which version I use? Any opinions on implementing a commercially supported version from a vendor such as Red Hat, Sun, etc.? Is any database particularly better at "exchanging" data with R?
>
> While cost is of course a consideration, it is probably a secondary consideration to overall performance, reliability, and ease of ongoing maintenance/support.

Hi, Thomas.

As for the database, you'll probably need to be more specific about what
you want to do.  Oracle, MySQL, and PostgreSQL (at least) have
packages that support their use from R.  Other databases can be
reached through ODBC with the RODBC package.  On the database side,
PostgreSQL allows one to embed an R interpreter into the database
(PL/R).  As for hardware requirements, those will depend on your
application, so again, you will probably need to be more specific.
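
To make the "passing data to and from R" part concrete, here is a
minimal sketch of the round trip using the DBI interface.  It assumes
the DBI and RSQLite packages are installed; an in-memory SQLite
database stands in for the server database only so the example is
self-contained, and the table name "measurements" and the chunk size
are made up for illustration.  With another backend, roughly only the
dbConnect() call should need to change.

```r
# Sketch only: assumes the DBI and RSQLite packages are installed.
# SQLite is a stand-in backend; "measurements" and the chunk size
# of 250 are illustrative assumptions, not from the thread.
library(DBI)

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Push a data frame from R into the database.
df <- data.frame(id = 1:1000, value = sqrt(1:1000))
dbWriteTable(con, "measurements", df)

# Let the database aggregate and hand back only the small result.
total <- dbGetQuery(
  con, "SELECT COUNT(*) AS n, SUM(value) AS s FROM measurements"
)

# Pull a larger result back in chunks instead of all at once,
# so that only one chunk has to sit in R's memory at a time.
res <- dbSendQuery(con, "SELECT id, value FROM measurements ORDER BY id")
rows_seen <- 0
while (!dbHasCompleted(res)) {
  chunk <- dbFetch(res, n = 250)
  rows_seen <- rows_seen + nrow(chunk)
}
dbClearResult(res)

dbDisconnect(con)
```

Fetching in chunks, or aggregating in SQL before the data reaches R,
is one way to keep memory use down whichever database you end up
choosing.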

Sean



