[R] database table merging tips with R
m_olshansky at yahoo.com
Fri Sep 12 00:59:19 CEST 2008
One possibility is as follows:
If r$userid is your array of (2000) ID's then
s <- paste(r$userid,sep=",")
s<- paste("select t.userid, x, y, z from largetable t where t.serid in (",s,")",sep="")
d <- sqlQuery(connection,s)
--- On Fri, 12/9/08, Avram Aelony <aavram at mac.com> wrote:
> From: Avram Aelony <aavram at mac.com>
> Subject: [R] database table merging tips with R
> To: r-help at stat.math.ethz.ch
> Received: Friday, 12 September, 2008, 4:33 AM
> Dear R list,
> What is the best way to efficiently marry an R dataset with
> a very large (Oracle) database table?
> The goal is to only return Oracle table rows that match IDs
> present in the R dataset.
> I have an R data frame with 2000 user IDs analogous to: r =
> ...and I need to pull data from an Oracle table only for
> these 2000 IDs. The Oracle table is quite large.
> Additionally, the sql query may need to join to other tables
> to bring in ancillary fields.
> I currently connect to Oracle via odbc:
> connection <- odbcConnect("****",
> uid="****", pwd="****")
> d = sqlQuery(connection, "select userid, x, y, z from
> largetable where timestamp > sysdate -7")
> ...allowing me to pull data from the database table into
> the R object "d" and then use the R merge
> function. The problem however is that if "d" is
> too large it may fail due to memory limitations or be
> inefficient. I would like to push the merge portion to the
> database and it would be very convenient if it were possible
> to request that the query look to the R object for the
> ID's to which it should restrict the output.
> Is there a way to do this?
> Something like the following fictional code:
> d = sqlQuery(connection, "select t.userid, x, y, z
> from largetable t where r$userid=t.userid")
> Would sqldf (http://code.google.com/p/sqldf/) help me out
> here? If so, how? This would be convenient and help me
> avoid needing to create a temporary table to store the R
> data, join via sql, then return the data back to R.
> I am using R version 2.7.2 (2008-08-25) / i386-pc-mingw32 .
> Thanks for your comments, ideas, recommendations.
> R-help at r-project.org mailing list
> PLEASE do read the posting guide
> and provide commented, minimal, self-contained,
> reproducible code.
More information about the R-help