[R] SQL statements (directly) in R

christian.ritter at shell.com
Mon Feb 12 09:30:59 CET 2007


Hi R-users,

This note will interest people who would like to use SQL statements on R data frames (a bit like PROC SQL in SAS). Please reply to me only, unless you really want to keep the entire R-help list posted on this.

I've been thinking about a package implementing SQL queries in R, and I am about to start writing a very rudimentary version. What I have in mind is the following:

It would work in two ways:
- via a generic sql("..") wrapper, which accepts an arbitrary query statement, and
- via convenience functions, such as SELECT("..."), ...

What would be needed is an "sqlTable" class extending the data frame. This class would need extra slots for indices and some other information. I would try to stay very basic in the beginning and accept relatively inefficient handling of the tables. Later on, direct calls using the binary representations could replace the high-level handling.
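
To make the "sqlTable" idea concrete, here is a minimal sketch. It uses an S3 class with an attribute in place of formal slots (an S4 class with contains = "data.frame" would be the other obvious route). The constructor sqlTable(), the helper createIndex() and the "indices" attribute are all hypothetical names chosen for illustration, and the "index" is nothing more than a stored sort order.

```r
# Hypothetical constructor: wraps a data frame and attaches an (initially
# empty) list of indices as an attribute.
sqlTable <- function(df) {
  stopifnot(is.data.frame(df))
  attr(df, "indices") <- list()
  class(df) <- c("sqlTable", class(df))
  df
}

# Hypothetical helper: store the sort order of one column as its index.
createIndex <- function(tab, column) {
  stopifnot(inherits(tab, "sqlTable"), column %in% names(tab))
  idx <- attr(tab, "indices")
  idx[[column]] <- order(tab[[column]])
  attr(tab, "indices") <- idx
  tab
}

emp <- sqlTable(data.frame(id = c(3L, 1L, 2L), name = c("c", "a", "b")))
emp <- createIndex(emp, "id")
attr(emp, "indices")$id   # row positions in ascending order of id
```

Because the object still inherits from "data.frame", all existing data frame code keeps working on it, which is the main attraction of extending rather than wrapping.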

Now come my questions:
- have others started working on this?
- are others interested in this?
- ideas on how to go about it?

Chris

P.S.: 
Here are a few ideas I was thinking about.
One way would be to incorporate a GPL or LGPL RDBMS into the package, push the data frame to it, execute the statement there, and get the result back. The advantage: fast to implement. The disadvantage: pushing the data is a bad idea (but then again, at the top level R will most probably make a copy of it anyway). The convenience wrappers would then construct SQL statements and the database engine would evaluate them.
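
A rough sketch of this first route, assuming the DBI and RSQLite packages are installed. SQLite is only a stand-in for the embedded engine (it is public domain rather than GPL or LGPL), and the signature of sql(), with tables passed as named arguments, is an assumption made for illustration.

```r
# Assumes the DBI and RSQLite packages are available.
sql <- function(query, ...) {
  tables <- list(...)                          # named data frames to expose
  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
  on.exit(DBI::dbDisconnect(con))              # always close the connection
  for (nm in names(tables)) {
    DBI::dbWriteTable(con, nm, tables[[nm]])   # push: this copies the data
  }
  DBI::dbGetQuery(con, query)                  # run the query, pull the result
}

if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  emp <- data.frame(id = 1:3, salary = c(100, 200, 300))
  print(sql("SELECT id FROM emp WHERE salary > 150", emp = emp))
}
```

Every call pushes the full tables into the engine again, which is exactly the copying cost mentioned above.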

The other idea is to stay in R and link the wrappers to suitably composed calls to subset, cbind, rbind, etc. Here it would be more challenging to create the sql("..") interface, since its string would have to be parsed.
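
As an illustration of the pure-R route, a convenience wrapper could look roughly like this. The argument names (from, columns, where) are hypothetical, and the WHERE clause is an unevaluated R expression rather than SQL text, so no parsing is involved. The row selection mirrors what subset() does internally.

```r
# Hypothetical SELECT(): 'where' is an R expression evaluated inside 'from'.
SELECT <- function(from, columns = names(from), where = TRUE) {
  cond <- eval(substitute(where), from, parent.frame())
  keep <- rep_len(cond, nrow(from))
  keep <- keep & !is.na(keep)          # treat NA as "no match", like subset()
  from[keep, columns, drop = FALSE]
}

emp <- data.frame(id = 1:3, salary = c(100, 200, 300))
SELECT(emp, columns = "id", where = salary > 150)
```

The sql("..") interface would then need a parser that turns the query string into calls of this kind.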

The politically incorrect thing about these SQL functions is that some of them (UPDATE, INSERT) will have to modify objects within the function call. They would not work via the return object.
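
To show what modifying the object within the call means, here is a minimal sketch of an UPDATE that changes the caller's data frame as a side effect. The signature is hypothetical, and looking the object up by name in the calling environment is only one possible mechanism.

```r
# Hypothetical UPDATE(): rewrites the data frame named by its first argument
# in the caller's environment instead of returning a modified copy.
UPDATE <- function(table, column, value, where = TRUE) {
  name <- deparse(substitute(table))
  env  <- parent.frame()
  df   <- get(name, envir = env)
  rows <- eval(substitute(where), df, env)
  rows <- rep_len(rows, nrow(df))
  rows <- rows & !is.na(rows)
  df[rows, column] <- value
  assign(name, df, envir = env)   # the side effect
  invisible(sum(rows))            # number of rows changed
}

emp <- data.frame(id = 1:3, salary = c(100, 200, 300))
UPDATE(emp, "salary", 0, where = id == 2)
emp$salary                        # emp itself has changed
```

The conventional R style would instead be emp <- UPDATE(emp, ...), which is what "working via the return object" refers to.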

As I said, comments welcome.


