[R-sig-DB] Storing R objects (was [R] advice requested re: building "good" system (R, SQL db) for handling large datasets)

Richard Pearson
Thu Feb 7 13:16:17 CET 2008


(moved to R-sig-db from R-help)

Jeff,

I have a project where I want to create large numbers of large, complex 
objects (e.g. Bioconductor ExpressionSet objects). I want to store these 
along with metadata (such as what raw data and parameters were used to 
create the object). I will later want to access subsets of these 
objects, with the subset specified by a query. It seems to me the 
natural way to do this would be to store the metadata and the objects 
themselves in database tables, and I have assumed that the objects would 
need to be serialised and stored as BLOBs. It sounds like at present 
there are no plans for infrastructure that would allow me to do this, 
but I would be interested to know if anyone plans to make such a 
scenario possible in the future.
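
A minimal sketch of the BLOB approach described above, written in Python 
rather than R so that it runs with the standard library alone. The 
assumptions are mine, not part of the original message: pickle stands in 
for R's serialize(obj, NULL) (which returns a raw vector that 
unserialize() restores), an in-memory SQLite database stands in for the 
real server, and the table and column names are hypothetical.

```python
import pickle
import sqlite3

# Hypothetical schema: one row per stored object, metadata in ordinary columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results ("
    " id INTEGER PRIMARY KEY,"
    " raw_data TEXT,"       # which raw data the object was built from
    " normalisation TEXT,"  # a parameter used to create it
    " object BLOB)"         # the serialised object itself
)


def store(conn, obj, raw_data, normalisation):
    """Serialise obj and insert it together with its metadata."""
    blob = pickle.dumps(obj)
    cur = conn.execute(
        "INSERT INTO results (raw_data, normalisation, object) VALUES (?, ?, ?)",
        (raw_data, normalisation, blob),
    )
    conn.commit()
    return cur.lastrowid


def fetch(conn, normalisation):
    """Return the deserialised objects whose metadata match the query."""
    rows = conn.execute(
        "SELECT object FROM results WHERE normalisation = ? ORDER BY id",
        (normalisation,),
    )
    return [pickle.loads(blob) for (blob,) in rows]


# Toy objects standing in for large, complex results.
store(conn, {"genes": ["a", "b"], "values": [1.0, 2.5]}, "chipA.cel", "rma")
store(conn, {"genes": ["a", "b"], "values": [0.9, 2.7]}, "chipB.cel", "mas5")
store(conn, {"genes": ["c"], "values": [4.2]}, "chipC.cel", "rma")

# The subset is chosen by a query on the metadata, not by unpacking the objects.
rma_objects = fetch(conn, "rma")
```

The database only ever sees opaque bytes, so queries can use the 
metadata columns but not the contents of the objects. As with any 
serialised data, blobs should only be deserialised when they come from a 
trusted source.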

I am assuming in the above that it is not possible to store arbitrarily 
complex R objects in a DB without a lot of work coercing all the 
various slots in the object to data.frames, and saving the data.frames 
to different tables. I've had a quick scan through the documentation for 
DBI, RODBC, RMySQL and ROracle, but couldn't see any such functionality.

An alternative for my situation would be to store the R objects as files 
(using save) and store the metadata and filenames in a DB, but this 
seems to me to add an extra layer of complexity/maintenance. Finally, I 
could of course save everything as files, but one of the reasons for 
storing things in a DB is that I would like to create dynamic web 
pages linked to metadata and results data in the DB.
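
The file-based alternative can be sketched the same way, again in 
Python with the standard library only. Here pickle files stand in for 
files written with R's save(), and the directory layout, table and 
function names are hypothetical. The final helper illustrates the extra 
maintenance mentioned above: the database and the directory can drift 
apart, so something has to check that every recorded file still exists.

```python
import pickle
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical layout: objects live as files in one directory,
# while the database records only metadata and the file name.
store_dir = Path(tempfile.mkdtemp())
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (id INTEGER PRIMARY KEY, raw_data TEXT, filename TEXT)"
)


def save_object(obj, raw_data):
    """Insert the metadata row, then write the object to a file named after the row id."""
    cur = conn.execute("INSERT INTO results (raw_data) VALUES (?)", (raw_data,))
    filename = f"object_{cur.lastrowid}.pkl"
    (store_dir / filename).write_bytes(pickle.dumps(obj))
    conn.execute(
        "UPDATE results SET filename = ? WHERE id = ?", (filename, cur.lastrowid)
    )
    conn.commit()
    return filename


def load_objects(raw_data):
    """Look up file names by metadata, then read the objects back from disk."""
    rows = conn.execute(
        "SELECT filename FROM results WHERE raw_data = ? ORDER BY id", (raw_data,)
    )
    return [pickle.loads((store_dir / name).read_bytes()) for (name,) in rows]


def missing_files():
    """File names listed in the database that have no file on disk."""
    rows = conn.execute("SELECT filename FROM results ORDER BY id")
    return [name for (name,) in rows if not (store_dir / name).exists()]


first = save_object([1, 2, 3], "chipA.cel")
save_object([4, 5], "chipB.cel")
loaded = load_objects("chipA.cel")
```

This avoids any BLOB size limit in the database, at the cost of keeping 
two stores consistent.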

Best wishes

Richard.


Jeffrey Horner wrote:
> Richard Pearson wrote on 02/06/2008 06:25 AM:
>> Hi Thomas
> [...]
>> With databases, one issue that might be relevant is whether you want 
>> to store data in tables (e.g. one table to store one data.frame) that 
>> can subsequently be manipulated in the DB, or to store R objects as R 
>> objects (e.g. as BLOBs). My situation is likely to be the latter case, 
>> and one of my concerns is that many DBs have an upper limit of 2GB on 
>> BLOBs, and I might potentially have objects that are larger than this.
> [...]
>
> I'd be curious as to why you'd want to store and retrieve R objects 
> from a BLOB column in a table. I've often thought about this, but 
> unfortunately neither the DBI package nor the RODBC package supports this.
>
> Jeff



