[R-sig-DB] concurrent reading/writing in "chunks" with RSQLite (need some help troubleshooting)

kMan kch@mber|n @end|ng |rom gm@||@com
Wed Jun 30 05:39:11 CEST 2010


I apologize. I sent this initially without correcting the subject line.

Sincerely,
KeithC.

-----Original Message-----
Dear Jonathan,

>I'm trying to cycle through a database, reading a chunk of data from 
>one
table in the DB, processing this data chunk, >and then writing the output to
another table in the database.  I'm having to set the fetch(n=) option
because the >database is too large to read the entire thing into memory at
once.  Here's my sample code:
[snip]

There is no need to open separate connections for reading vs. writing.

Your code was corrupting the connection between R/driver, I think, and I
don't know why. But when I close a connection in R and my OS won't let me
delete the file, there is a different kind of problem. The code below works
on my system. I wish I knew enough to explain wtf happened with your code.
------------------------------------------------------------------------
##
## Prelim/Declarations
##
require(RSQLite)
mysqldb <- "test.sqlite"
m <- dbDriver("SQLite")
tbl.names <- c("TABLEA", "TABLEB")
q1<-paste("SELECT COUNT(*) FROM", tbl.names[1])

##
## Surrogate data (if not allready done) ##
(somedata=data.frame(data1=seq(1:10),data2=seq(1:10)*3))
con<-dbConnect(m, dbname=mysqldb)
dbWriteTable(con,"TABLEA",somedata, overwrite=TRUE)
dbDisconnect(con)

##
## Process TABLEA in parts, write to TABLEB ## con<-dbConnect(m,
dbname=mysqldb) n.lines<-dbGetQuery(con, q1) #NOTE class is data.frame
chunk.size<-3
start.index<-seq(from = 1, to = n.lines[1,1], by = 3)
limit<-c(diff(start.index), n.lines[1,1]-sum(diff(start.index)))
i<-1
for(i in 1:length(limit)){
  if(i<length(limit)) { q2<-paste("SELECT * FROM TABLEA WHERE row_names >=",

      start.index[i], "LIMIT", limit[i])
  } else { q2<-paste("SELECT * FROM TABLEA WHERE row_names =", 
      start.index[i], "LIMIT", limit[i])
  }
  dat<-dbGetQuery(con, q2)
  dbWriteTable(con, tbl.names[2], dat, row.names=FALSE, append=TRUE) }
dbGetQuery(con, "SELECT * FROM TABLEB")
dbDisconnect(con)
------------------------------------------------------------------------

Sincerely,
KeithC.




More information about the R-sig-DB mailing list