[BioC] dbWriteTable Failure

Elliot Joel Bernstein elliot.bernstein at fdopartners.com
Fri Jan 14 15:25:03 CET 2011


Martin -

Thanks for your reply. Is RdbiPgSQL no longer maintained at all? I have tried RPostgreSQL, but it has an even worse problem. In that package, dbWriteTable is implemented by calling write.table on the data frame to write it to a temporary file, and then load it into the database, which means that the formatting options set in R affect the result. Specifically, write.table by default formats some numbers in scientific notation, which causes the database load to fail.

After looking into the problem with RdbiPgSQL a little more, it seems that it's caused by dbDisconnect calling the same function that is registered as the finalizer. I think it could be easily solved by using an indicator variable to keep track of whether PQfinish has been called. (I can also just call rm instead of dbDisconnect, but it's a little disconcerting that the package contains a function that causes memory errors when used as intended.)

- Elliot


On Tue, Jan 11, 2011 at 08:12:21PM -0800, Martin Morgan wrote:
> On 01/11/2011 11:37 AM, Elliot Joel Bernstein wrote:
> > I am having a problem using the dbWriteTable function in package RdbiPgSQL to insert data into a PostgreSQL table. Specifically, the following code sometimes works, but sometimes fails with an error message indicating that there is "no database connection":
> > 
> > 
> > 
> > =====================================================================
> > dbtest.R
> > =====================================================================
> > 
> > require(RdbiPgSQL)
> 
> Hi Elliot -- not so much an answer, but unless RdbiPgSQL is central you
> might try the CRAN package RPostgreSQL, which is actively maintained.
> 
> Martin
> 
> > 
> > test <- function(conn, n) {
> >   
> >   dbSendQuery(conn, "create temp table temp (x int, y int)")
> > 
> >   n <- 10000
> >   dat <- data.frame(x=sample(1000,n,replace=T), y=sample(1000,n,replace=T))
> >   
> >   for (i in 1:dim(dat)[1]) {
> > 
> >     err <- try(dbSendQuery(conn, sprintf("insert into temp values (%d,%d)", dat$x[i], dat$y[i])))
> >     if (inherits(err, "try-error")) {
> >       print(conn)
> >       break
> >     }
> >     
> >   }
> > 
> >   print(dbGetQuery(conn, "select count(1) from temp"))
> > 
> > }
> > 
> > conn <- dbConnect(PgSQL(), dbname="gf", host="gf-host")
> > test(conn, 10000)
> > dbDisconnect(conn)
> > 
> > =====================================================================
> > 
> > 
> > 
> > The following is sample output from multiple runs of this script:
> > 
> > 
> > 
> > =====================================================================
> > Sample output
> > =====================================================================
> > 
> > R version 2.12.0 (2010-10-15)
> > Copyright (C) 2010 The R Foundation for Statistical Computing
> > ISBN 3-900051-07-0
> > Platform: x86_64-redhat-linux-gnu (64-bit)
> > 
> > R is free software and comes with ABSOLUTELY NO WARRANTY.
> > You are welcome to redistribute it under certain conditions.
> > Type 'license()' or 'licence()' for distribution details.
> > 
> >   Natural language support but running in an English locale
> > 
> > R is a collaborative project with many contributors.
> > Type 'contributors()' for more information and
> > 'citation()' on how to cite R or R packages in publications.
> > 
> > Type 'demo()' for some demos, 'help()' for on-line help, or
> > 'help.start()' for an HTML browser interface to help.
> > Type 'q()' to quit R.
> > 
> >> source("dbtest.R")
> > Loading required package: RdbiPgSQL
> > Loading required package: Rdbi
> >   count
> > 1 10000
> >> source("dbtest.R")
> > Error in dbSendQuery.PgSQL.conn(conn, sprintf("insert into temp values (%d,%d)",  : 
> >   No database connection
> > status = 1 
> > database.name =  a 
> > host.name = p�� 
> > options = 8}G 
> > password = P� 
> > port = � 
> > tty = �� 
> > socket = -1 
> > client.encoding = -1 
> > backend.pid = 0 
> > Error in dbSendQuery.PgSQL.conn(conn, ...) : No database connection
> > 
> > =====================================================================
> > 
> > 
> > 
> > Here is the result of a traceback:
> > 
> > 
> > 
> > =====================================================================
> > traceback
> > =====================================================================
> > 
> >> traceback()
> > 11: .Call("PgSQLsendQuery", conn, query, PACKAGE = "RdbiPgSQL")
> > 10: dbSendQuery.PgSQL.conn(conn, ...) at dbtest.R#21
> > 9: dbSendQuery(conn, ...) at dbtest.R#21
> > 8: dbGetResult(dbSendQuery(conn, ...)) at dbtest.R#21
> > 7: dbGetQuery.PgSQL.conn(conn, "select count(1) from temp") at dbtest.R#21
> > 6: dbGetQuery(conn, "select count(1) from temp") at dbtest.R#21
> > 5: print(dbGetQuery(conn, "select count(1) from temp")) at dbtest.R#21
> > 4: test(conn, 10000)
> > 3: eval.with.vis(expr, envir, enclos)
> > 2: eval.with.vis(ei, envir)
> > 1: source("dbtest.R")
> > 
> > =====================================================================
> > 
> > 
> > 
> > And here is my sessionInfo:
> > 
> > 
> > 
> > =====================================================================
> > sessionInfo
> > =====================================================================
> > 
> >> sessionInfo()
> > R version 2.12.0 (2010-10-15)
> > Platform: x86_64-redhat-linux-gnu (64-bit)
> > 
> > locale:
> >  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
> >  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
> >  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8   
> >  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
> >  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
> > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
> > 
> > attached base packages:
> > [1] stats     graphics  grDevices utils     datasets  methods   base     
> > 
> > other attached packages:
> > [1] RdbiPgSQL_1.24.0 Rdbi_1.24.0     
> > 
> > =====================================================================
> > 
> > 
> > 
> > I am using version 8.4.4 of the PostgreSQL-devel libraries, and the server is running version 8.4.1. The problem does not always occur on the second run; I am sometimes able to run the script multiple times before it eventually fails. Also, it does not always fail at the same row of the dataframe, nor is there anything obviously wrong with the data values on the row where it fails. I would appreciate any help in diagnosing the problem.
> > 
> > Thank you very much.
> > 
> > - Elliot
> > 
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at r-project.org
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
> 
> -- 
> Computational Biology
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
> 
> Location: M1-B861
> Telephone: 206 667-2793



More information about the Bioconductor mailing list