[R] database table merging tips with R

Moshe Olshansky m_olshansky at yahoo.com
Fri Sep 12 00:59:19 CEST 2008


One possibility is as follows:

If r$userid is your array of (2000) ID's then
s <- paste(r$userid,sep=",")
s<- paste("select t.userid, x, y, z from largetable t where t.serid in (",s,")",sep="")
and finally
d <- sqlQuery(connection,s)

Regards,

Moshe.


--- On Fri, 12/9/08, Avram Aelony <aavram at mac.com> wrote:

> From: Avram Aelony <aavram at mac.com>
> Subject: [R] database table merging tips with R
> To: r-help at stat.math.ethz.ch
> Received: Friday, 12 September, 2008, 4:33 AM
> Dear R list,
> 
> What is the best way to efficiently marry an R dataset with
> a very large (Oracle) database table?  
> 
> The goal is to only return Oracle table rows that match IDs
> present in the R dataset.  
> I have an R data frame with 2000 user IDs analogous to: r =
> data.frame(userid=round(runif(2000)*100000,0))
> 
> ...and I need to pull data from an Oracle table only for
> these 2000 IDs.  The Oracle table is quite large.
> Additionally, the sql query may need to join to other tables
> to bring in ancillary fields.
> 
> I currently connect to Oracle via odbc: 
> 
> library(RODBC)
> connection <- odbcConnect("****",
> uid="****", pwd="****")
> d = sqlQuery(connection, "select userid, x, y, z from
> largetable where timestamp > sysdate -7")
> 
> ...allowing me to pull data from the database table into
> the R object "d" and then use the R merge
> function.  The problem however is that if "d" is
> too large it may fail due to memory limitations or be
> inefficient.  I would like to push the merge portion to the
> database and it would be very convenient if it were possible
> to request that the query look to the R object for the
> ID's to which it should restrict the output.  
> 
> Is there a way to do this?
> Something like the following fictional code:
> d = sqlQuery(connection, "select t.userid, x, y, z
> from largetable t where r$userid=t.userid")
> 
> Would sqldf (http://code.google.com/p/sqldf/) help me out
> here? If so, how?   This would be convenient and help me
> avoid needing to create a temporary table to store the R
> data, join via sql, then return the data back to R.
> 
> I am using R version 2.7.2 (2008-08-25) / i386-pc-mingw32 .
> 
> Thanks for your comments, ideas, recommendations.
> 
> 
> -Avram
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained,
> reproducible code.



More information about the R-help mailing list