[BioC] dbWriteTable Failure

Dirk Eddelbuettel edd at debian.org
Fri Jan 14 18:56:47 CET 2011


On 14 January 2011 at 12:46, Sean Davis wrote:
| 
| 
| On Fri, Jan 14, 2011 at 12:36 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
| 
|     On 01/14/2011 06:25 AM, Elliot Joel Bernstein wrote:
|     > Martin -
|     >
|     > Thanks for your reply. Is RdbiPgSQL no longer maintained at all? I
| 
|     Hi Elliot --
| 
|     (sorry for the last post, oops). The package hasn't had any substantial
|     change since 2006. I'm cc'ing the currently listed maintainer, and also
|     the maintainer of RPostgreSQL, who might have additional insights.
| 
| 
| 
| Perhaps Dirk will comment directly but, with regards to RPosgreSQL, there is a
| google code site and an issue for dbWriteTable was already filed.
| 
| http://code.google.com/p/rpostgresql/issues/detail?id=4&can=1
| 
| The resolution is that Dirk would appreciate a patch (which I have not been
| able to supply--someone else?).

Did the OP try a current version?   NEWS follows below.  Tomoaki has done
some wonderful work and we are in much better shape than we were, say, one
year ago.

That said, there are always open issues.  See the Google Code site; and I cc
out dev list.

Dirk

Version 0.1-7 -- 2010-10-17

    o   Several potential buffer overruns were fixed

    o   dbWriteTable now writes a data.frame to database through a network
        connection rather than a temporary file. Note that row_names may be
        changed in future releases.  Also, passing in filenames instead of
        data.frame is not supported at this time. 

    o   When no host is specified, a connection to the PostgreSQL server 
        is made via UNIX domain socket (just like psql does)

    o   Table and column names are case sensitive, and identifiers are escaped
        or quoted appropriately, so that any form of table/column names can be
        created, searched, or removed, including upper-, lower- and mixed-case.

    o   nullOk in dbColumnInfo has a return value of NA when the column does
        not correspond to a column in the table. The utility of nullOk is
        doubtful but not removed at this time.

    o   Correct Windows getpid() declaration (with thanks to Brian D. Ripley)

    o   A call of as.POSIXct() with a time format string wrongly passed to TZ
        has been corrected; this should help with intra-day timestamps (with
        thanks to Steve Eick)

    o   Usage of tmpdir has been improved on similarly to Linux (with thanks
        to Robert McGehee)

Dirk


| 
| Sean
| 
|  
| 
|     Martin
| 
|     have tried RPostgreSQL, but it has an even worse problem. In that
|     package, dbWriteTable is implemented by calling write.table on the data
|     frame to write it to a temporary file, and then load it into the
|     database, which means that the formatting options set in R affect the
|     result. Specifically, write.table by default formats some numbers in
|     scientific notation, which causes the database load to fail.
|     >
|     > After looking into the problem with RdbiPgSQL a little more, it
|     > seems
|     that it's caused by dbDisconnect calling the same function that is
|     registered as the finalizer. I think it could be easily solved by using
|     an indicator variable to keep track of whether PQfinish has been called.
|     (I can also just call rm instead of dbDisconnect, but it's a little
|     disconcerting that the package contains a function that causes memory
|     errors when used as intended.)
| 
|     > - Elliot
|     >
|     >
|     > On Tue, Jan 11, 2011 at 08:12:21PM -0800, Martin Morgan wrote:
|     >> On 01/11/2011 11:37 AM, Elliot Joel Bernstein wrote:
|     >>> I am having a problem using the dbWriteTable function in package
|     RdbiPgSQL to insert data into a PostgreSQL table. Specifically, the
|     following code sometimes works, but sometimes fails with an error message
|     indicating that there is "no database connection":
|     >>>
|     >>>
|     >>>
|     >>> =====================================================================
|     >>> dbtest.R
|     >>> =====================================================================
|     >>>
|     >>> require(RdbiPgSQL)
|     >>
|     >> Hi Elliot -- not so much an answer, but unless RdbiPgSQL is central you
|     >> might try the CRAN package RPostgreSQL, which is actively maintained.
|     >>
|     >> Martin
|     >>
|     >>>
|     >>> test <- function(conn, n) {
|     >>>
|     >>>   dbSendQuery(conn, "create temp table temp (x int, y int)")
|     >>>
|     >>>   n <- 10000
|     >>>   dat <- data.frame(x=sample(1000,n,replace=T), y=sample(1000,n,replace
|     =T))
|     >>>
|     >>>   for (i in 1:dim(dat)[1]) {
|     >>>
|     >>>     err <- try(dbSendQuery(conn, sprintf("insert into temp values
|     (%d,%d)", dat$x[i], dat$y[i])))
|     >>>     if (inherits(err, "try-error")) {
|     >>>       print(conn)
|     >>>       break
|     >>>     }
|     >>>
|     >>>   }
|     >>>
|     >>>   print(dbGetQuery(conn, "select count(1) from temp"))
|     >>>
|     >>> }
|     >>>
|     >>> conn <- dbConnect(PgSQL(), dbname="gf", host="gf-host")
|     >>> test(conn, 10000)
|     >>> dbDisconnect(conn)
|     >>>
|     >>> =====================================================================
|     >>>
|     >>>
|     >>>
|     >>> The following is sample output from multiple runs of this script:
|     >>>
|     >>>
|     >>>
|     >>> =====================================================================
|     >>> Sample output
|     >>> =====================================================================
|     >>>
|     >>> R version 2.12.0 (2010-10-15)
|     >>> Copyright (C) 2010 The R Foundation for Statistical Computing
|     >>> ISBN 3-900051-07-0
|     >>> Platform: x86_64-redhat-linux-gnu (64-bit)
|     >>>
|     >>> R is free software and comes with ABSOLUTELY NO WARRANTY.
|     >>> You are welcome to redistribute it under certain conditions.
|     >>> Type 'license()' or 'licence()' for distribution details.
|     >>>
|     >>>   Natural language support but running in an English locale
|     >>>
|     >>> R is a collaborative project with many contributors.
|     >>> Type 'contributors()' for more information and
|     >>> 'citation()' on how to cite R or R packages in publications.
|     >>>
|     >>> Type 'demo()' for some demos, 'help()' for on-line help, or
|     >>> 'help.start()' for an HTML browser interface to help.
|     >>> Type 'q()' to quit R.
|     >>>
|     >>>> source("dbtest.R")
|     >>> Loading required package: RdbiPgSQL
|     >>> Loading required package: Rdbi
|     >>>   count
|     >>> 1 10000
|     >>>> source("dbtest.R")
|     >>> Error in dbSendQuery.PgSQL.conn(conn, sprintf("insert into temp values
|     (%d,%d)",  :
|     >>>   No database connection
|     >>> status = 1
|     >>> database.name =  a
|     >>> host.name = p��
|     >>> options = 8}G
|     >>> password = P�
|     >>> port = �
|     >>> tty = ��
|     >>> socket = -1
|     >>> client.encoding = -1
|     >>> backend.pid = 0
|     >>> Error in dbSendQuery.PgSQL.conn(conn, ...) : No database connection
|     >>>
|     >>> =====================================================================
|     >>>
|     >>>
|     >>>
|     >>> Here is the result of a traceback:
|     >>>
|     >>>
|     >>>
|     >>> =====================================================================
|     >>> traceback
|     >>> =====================================================================
|     >>>
|     >>>> traceback()
|     >>> 11: .Call("PgSQLsendQuery", conn, query, PACKAGE = "RdbiPgSQL")
|     >>> 10: dbSendQuery.PgSQL.conn(conn, ...) at dbtest.R#21
|     >>> 9: dbSendQuery(conn, ...) at dbtest.R#21
|     >>> 8: dbGetResult(dbSendQuery(conn, ...)) at dbtest.R#21
|     >>> 7: dbGetQuery.PgSQL.conn(conn, "select count(1) from temp") at dbtest.R
|     #21
|     >>> 6: dbGetQuery(conn, "select count(1) from temp") at dbtest.R#21
|     >>> 5: print(dbGetQuery(conn, "select count(1) from temp")) at dbtest.R#21
|     >>> 4: test(conn, 10000)
|     >>> 3: eval.with.vis(expr, envir, enclos)
|     >>> 2: eval.with.vis(ei, envir)
|     >>> 1: source("dbtest.R")
|     >>>
|     >>> =====================================================================
|     >>>
|     >>>
|     >>>
|     >>> And here is my sessionInfo:
|     >>>
|     >>>
|     >>>
|     >>> =====================================================================
|     >>> sessionInfo
|     >>> =====================================================================
|     >>>
|     >>>> sessionInfo()
|     >>> R version 2.12.0 (2010-10-15)
|     >>> Platform: x86_64-redhat-linux-gnu (64-bit)
|     >>>
|     >>> locale:
|     >>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
|     >>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
|     >>>  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
|     >>>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
|     >>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
|     >>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
|     >>>
|     >>> attached base packages:
|     >>> [1] stats     graphics  grDevices utils     datasets  methods   base
|     >>>
|     >>> other attached packages:
|     >>> [1] RdbiPgSQL_1.24.0 Rdbi_1.24.0
|     >>>
|     >>> =====================================================================
|     >>>
|     >>>
|     >>>
|     >>> I am using version 8.4.4 of the PostgreSQL-devel libraries, and the
|     server is running version 8.4.1. The problem does not always occur on the
|     second run; I am sometimes able to run the script multiple times before it
|     eventually fails. Also, it does not always fail at the same row of the
|     dataframe, nor is there anything obviously wrong with the data values on
|     the row where it fails. I would appreciate any help in diagnosing the
|     problem.
|     >>>
|     >>> Thank you very much.
|     >>>
|     >>> - Elliot
|     >>>
|     >>> _______________________________________________
|     >>> Bioconductor mailing list
|     >>> Bioconductor at r-project.org
|     >>> https://stat.ethz.ch/mailman/listinfo/bioconductor
|     >>> Search the archives: http://news.gmane.org/
|     gmane.science.biology.informatics.conductor
|     >>
|     >>
|     >> --
|     >> Computational Biology
|     >> Fred Hutchinson Cancer Research Center
|     >> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
|     >>
|     >> Location: M1-B861
|     >> Telephone: 206 667-2793
| 
| 
|     --
|     Computational Biology
|     Fred Hutchinson Cancer Research Center
|     1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
| 
|     Location: M1-B861
|     Telephone: 206 667-2793
| 
|     _______________________________________________
|     Bioconductor mailing list
|     Bioconductor at r-project.org
|     https://stat.ethz.ch/mailman/listinfo/bioconductor
|     Search the archives: http://news.gmane.org/
|     gmane.science.biology.informatics.conductor
| 
| 

-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com



More information about the Bioconductor mailing list