[R-sig-DB] R-sig-DB Digest, Vol 68, Issue 7

Seth Falcon @eth @end|ng |rom u@erpr|m@ry@net
Wed Jun 30 05:36:46 CEST 2010


On Tue, Jun 29, 2010 at 8:19 PM, kMan <kchamberln using gmail.com> wrote:
> Dear Jonathan,
>
>>I'm trying to cycle through a database, reading a chunk of data from one
> table in the DB, processing this data chunk, >and then writing the output to
> another table in the database.  I'm having to set the fetch(n=) option
> because the >database is too large to read the entire thing into memory at
> once.  Here's my sample code:
> [snip]
>
> There is no need to open separate connections for reading vs. writing.

While it is true that you can read and write from a single connection,
you can only have one result set open at a time and so if you want to
iterate through a result set and then write, you do, I believe, need
two connections.


I suspect that the code you propose will be considerably slower as you
are issuing multiple queries along with a WHERE constraint.

+ seth


>
> Your code was corrupting the connection between R/driver, I think, and I
> don't know why. But when I close a connection in R and my OS won't let me
> delete the file, there is a different kind of problem. The code below works
> on my system. I wish I knew enough to explain wtf happened with your code.
> ------------------------------------------------------------------------
> ##
> ## Prelim/Declarations
> ##
> require(RSQLite)
> mysqldb <- "test.sqlite"
> m <- dbDriver("SQLite")
> tbl.names <- c("TABLEA", "TABLEB")
> q1<-paste("SELECT COUNT(*) FROM", tbl.names[1])
>
> ##
> ## Surrogate data (if not allready done)
> ##
> (somedata=data.frame(data1=seq(1:10),data2=seq(1:10)*3))
> con<-dbConnect(m, dbname=mysqldb)
> dbWriteTable(con,"TABLEA",somedata, overwrite=TRUE)
> dbDisconnect(con)
>
> ##
> ## Process TABLEA in parts, write to TABLEB
> ##
> con<-dbConnect(m, dbname=mysqldb)
> n.lines<-dbGetQuery(con, q1) #NOTE class is data.frame
> chunk.size<-3
> start.index<-seq(from = 1, to = n.lines[1,1], by = 3)
> limit<-c(diff(start.index), n.lines[1,1]-sum(diff(start.index)))
> i<-1
> for(i in 1:length(limit)){
>  if(i<length(limit)) { q2<-paste("SELECT * FROM TABLEA WHERE row_names >=",
>
>      start.index[i], "LIMIT", limit[i])
>  } else { q2<-paste("SELECT * FROM TABLEA WHERE row_names =",
>      start.index[i], "LIMIT", limit[i])
>  }
>  dat<-dbGetQuery(con, q2)
>  dbWriteTable(con, tbl.names[2], dat, row.names=FALSE, append=TRUE)
> }
> dbGetQuery(con, "SELECT * FROM TABLEB")
> dbDisconnect(con)
> ------------------------------------------------------------------------
>
> Sincerely,
> KeithC.
>
>
>



-- 
Seth Falcon | @sfalcon | http://userprimary.net/




More information about the R-sig-DB mailing list