[BioC] dbWriteTable Failure
Dirk Eddelbuettel
edd at debian.org
Fri Jan 14 20:54:12 CET 2011
Elliot,
On 14 January 2011 at 14:35, Elliot Joel Bernstein wrote:
| Dirk -
|
| Thank you for following up on this. It's possible I'm just misunderstanding the code, but I grabbed the source for RPostgreSQL ver. 0.1-7, and as far as I can tell, the work for dbWriteTable (in postgresqlWriteTable) still works by creating a temporary file. (Lines 649-660 of PostgreSQLSupport.R.)
Hm. Looking at the code, I see that you have a point there. Maybe Tomoaki
can clarify.
We did move a couple of bugs out of the way and I was under the impression
that this had been taken care of too. If not, well, you always have your
money back guarantee.... ;-)
And in case that wasn't patently obvious, we *do* welcome welcome patches.
Cheers, Dirk
|
| - Elliot
|
| On Fri, Jan 14, 2011 at 11:56:47AM -0600, Dirk Eddelbuettel wrote:
| >
| > On 14 January 2011 at 12:46, Sean Davis wrote:
| > |
| > |
| > | On Fri, Jan 14, 2011 at 12:36 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
| > |
| > | On 01/14/2011 06:25 AM, Elliot Joel Bernstein wrote:
| > | > Martin -
| > | >
| > | > Thanks for your reply. Is RdbiPgSQL no longer maintained at all? I
| > |
| > | Hi Elliot --
| > |
| > | (sorry for the last post, oops). The package hasn't had any substantial
| > | change since 2006. I'm cc'ing the currently listed maintainer, and also
| > | the maintainer of RPostgreSQL, who might have additional insights.
| > |
| > |
| > |
| > | Perhaps Dirk will comment directly but, with regards to RPosgreSQL, there is a
| > | google code site and an issue for dbWriteTable was already filed.
| > |
| > | http://code.google.com/p/rpostgresql/issues/detail?id=4&can=1
| > |
| > | The resolution is that Dirk would appreciate a patch (which I have not been
| > | able to supply--someone else?).
| >
| > Did the OP try a current version? NEWS follows below. Tomoaki has done
| > some wonderful work and we are in much better shape than we were, say, one
| > year ago.
| >
| > That said, there are always open issues. See the Google Code site; and I cc
| > out dev list.
| >
| > Dirk
| >
| > Version 0.1-7 -- 2010-10-17
| >
| > o Several potential buffer overruns were fixed
| >
| > o dbWriteTable now writes a data.frame to database through a network
| > connection rather than a temporary file. Note that row_names may be
| > changed in future releases. Also, passing in filenames instead of
| > data.frame is not supported at this time.
| >
| > o When no host is specified, a connection to the PostgreSQL server
| > is made via UNIX domain socket (just like psql does)
| >
| > o Table and column names are case sensitive, and identifiers are escaped
| > or quoted appropriately, so that any form of table/column names can be
| > created, searched, or removed, including upper-, lower- and mixed-case.
| >
| > o nullOk in dbColumnInfo has a return value of NA when the column does
| > not correspond to a column in the table. The utility of nullOk is
| > doubtful but not removed at this time.
| >
| > o Correct Windows getpid() declaration (with thanks to Brian D. Ripley)
| >
| > o A call of as.POSIXct() with a time format string wrongly passed to TZ
| > has been corrected; this should help with intra-day timestamps (with
| > thanks to Steve Eick)
| >
| > o Usage of tmpdir has been improved on similarly to Linux (with thanks
| > to Robert McGehee)
| >
| > Dirk
| >
| >
| > |
| > | Sean
| > |
| > |
| > |
| > | Martin
| > |
| > | have tried RPostgreSQL, but it has an even worse problem. In that
| > | package, dbWriteTable is implemented by calling write.table on the data
| > | frame to write it to a temporary file, and then load it into the
| > | database, which means that the formatting options set in R affect the
| > | result. Specifically, write.table by default formats some numbers in
| > | scientific notation, which causes the database load to fail.
| > | >
| > | > After looking into the problem with RdbiPgSQL a little more, it
| > | > seems
| > | that it's caused by dbDisconnect calling the same function that is
| > | registered as the finalizer. I think it could be easily solved by using
| > | an indicator variable to keep track of whether PQfinish has been called.
| > | (I can also just call rm instead of dbDisconnect, but it's a little
| > | disconcerting that the package contains a function that causes memory
| > | errors when used as intended.)
| > |
| > | > - Elliot
| > | >
| > | >
| > | > On Tue, Jan 11, 2011 at 08:12:21PM -0800, Martin Morgan wrote:
| > | >> On 01/11/2011 11:37 AM, Elliot Joel Bernstein wrote:
| > | >>> I am having a problem using the dbWriteTable function in package
| > | RdbiPgSQL to insert data into a PostgreSQL table. Specifically, the
| > | following code sometimes works, but sometimes fails with an error message
| > | indicating that there is "no database connection":
| > | >>>
| > | >>>
| > | >>>
| > | >>> =====================================================================
| > | >>> dbtest.R
| > | >>> =====================================================================
| > | >>>
| > | >>> require(RdbiPgSQL)
| > | >>
| > | >> Hi Elliot -- not so much an answer, but unless RdbiPgSQL is central you
| > | >> might try the CRAN package RPostgreSQL, which is actively maintained.
| > | >>
| > | >> Martin
| > | >>
| > | >>>
| > | >>> test <- function(conn, n) {
| > | >>>
| > | >>> dbSendQuery(conn, "create temp table temp (x int, y int)")
| > | >>>
| > | >>> n <- 10000
| > | >>> dat <- data.frame(x=sample(1000,n,replace=T), y=sample(1000,n,replace
| > | =T))
| > | >>>
| > | >>> for (i in 1:dim(dat)[1]) {
| > | >>>
| > | >>> err <- try(dbSendQuery(conn, sprintf("insert into temp values
| > | (%d,%d)", dat$x[i], dat$y[i])))
| > | >>> if (inherits(err, "try-error")) {
| > | >>> print(conn)
| > | >>> break
| > | >>> }
| > | >>>
| > | >>> }
| > | >>>
| > | >>> print(dbGetQuery(conn, "select count(1) from temp"))
| > | >>>
| > | >>> }
| > | >>>
| > | >>> conn <- dbConnect(PgSQL(), dbname="gf", host="gf-host")
| > | >>> test(conn, 10000)
| > | >>> dbDisconnect(conn)
| > | >>>
| > | >>> =====================================================================
| > | >>>
| > | >>>
| > | >>>
| > | >>> The following is sample output from multiple runs of this script:
| > | >>>
| > | >>>
| > | >>>
| > | >>> =====================================================================
| > | >>> Sample output
| > | >>> =====================================================================
| > | >>>
| > | >>> R version 2.12.0 (2010-10-15)
| > | >>> Copyright (C) 2010 The R Foundation for Statistical Computing
| > | >>> ISBN 3-900051-07-0
| > | >>> Platform: x86_64-redhat-linux-gnu (64-bit)
| > | >>>
| > | >>> R is free software and comes with ABSOLUTELY NO WARRANTY.
| > | >>> You are welcome to redistribute it under certain conditions.
| > | >>> Type 'license()' or 'licence()' for distribution details.
| > | >>>
| > | >>> Natural language support but running in an English locale
| > | >>>
| > | >>> R is a collaborative project with many contributors.
| > | >>> Type 'contributors()' for more information and
| > | >>> 'citation()' on how to cite R or R packages in publications.
| > | >>>
| > | >>> Type 'demo()' for some demos, 'help()' for on-line help, or
| > | >>> 'help.start()' for an HTML browser interface to help.
| > | >>> Type 'q()' to quit R.
| > | >>>
| > | >>>> source("dbtest.R")
| > | >>> Loading required package: RdbiPgSQL
| > | >>> Loading required package: Rdbi
| > | >>> count
| > | >>> 1 10000
| > | >>>> source("dbtest.R")
| > | >>> Error in dbSendQuery.PgSQL.conn(conn, sprintf("insert into temp values
| > | (%d,%d)", :
| > | >>> No database connection
| > | >>> status = 1
| > | >>> database.name = a
| > | >>> host.name = p��
| > | >>> options = 8}G
| > | >>> password = P�
| > | >>> port = �
| > | >>> tty = ��
| > | >>> socket = -1
| > | >>> client.encoding = -1
| > | >>> backend.pid = 0
| > | >>> Error in dbSendQuery.PgSQL.conn(conn, ...) : No database connection
| > | >>>
| > | >>> =====================================================================
| > | >>>
| > | >>>
| > | >>>
| > | >>> Here is the result of a traceback:
| > | >>>
| > | >>>
| > | >>>
| > | >>> =====================================================================
| > | >>> traceback
| > | >>> =====================================================================
| > | >>>
| > | >>>> traceback()
| > | >>> 11: .Call("PgSQLsendQuery", conn, query, PACKAGE = "RdbiPgSQL")
| > | >>> 10: dbSendQuery.PgSQL.conn(conn, ...) at dbtest.R#21
| > | >>> 9: dbSendQuery(conn, ...) at dbtest.R#21
| > | >>> 8: dbGetResult(dbSendQuery(conn, ...)) at dbtest.R#21
| > | >>> 7: dbGetQuery.PgSQL.conn(conn, "select count(1) from temp") at dbtest.R
| > | #21
| > | >>> 6: dbGetQuery(conn, "select count(1) from temp") at dbtest.R#21
| > | >>> 5: print(dbGetQuery(conn, "select count(1) from temp")) at dbtest.R#21
| > | >>> 4: test(conn, 10000)
| > | >>> 3: eval.with.vis(expr, envir, enclos)
| > | >>> 2: eval.with.vis(ei, envir)
| > | >>> 1: source("dbtest.R")
| > | >>>
| > | >>> =====================================================================
| > | >>>
| > | >>>
| > | >>>
| > | >>> And here is my sessionInfo:
| > | >>>
| > | >>>
| > | >>>
| > | >>> =====================================================================
| > | >>> sessionInfo
| > | >>> =====================================================================
| > | >>>
| > | >>>> sessionInfo()
| > | >>> R version 2.12.0 (2010-10-15)
| > | >>> Platform: x86_64-redhat-linux-gnu (64-bit)
| > | >>>
| > | >>> locale:
| > | >>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
| > | >>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
| > | >>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
| > | >>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
| > | >>> [9] LC_ADDRESS=C LC_TELEPHONE=C
| > | >>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
| > | >>>
| > | >>> attached base packages:
| > | >>> [1] stats graphics grDevices utils datasets methods base
| > | >>>
| > | >>> other attached packages:
| > | >>> [1] RdbiPgSQL_1.24.0 Rdbi_1.24.0
| > | >>>
| > | >>> =====================================================================
| > | >>>
| > | >>>
| > | >>>
| > | >>> I am using version 8.4.4 of the PostgreSQL-devel libraries, and the
| > | server is running version 8.4.1. The problem does not always occur on the
| > | second run; I am sometimes able to run the script multiple times before it
| > | eventually fails. Also, it does not always fail at the same row of the
| > | dataframe, nor is there anything obviously wrong with the data values on
| > | the row where it fails. I would appreciate any help in diagnosing the
| > | problem.
| > | >>>
| > | >>> Thank you very much.
| > | >>>
| > | >>> - Elliot
| > | >>>
| > | >>> _______________________________________________
| > | >>> Bioconductor mailing list
| > | >>> Bioconductor at r-project.org
| > | >>> https://stat.ethz.ch/mailman/listinfo/bioconductor
| > | >>> Search the archives: http://news.gmane.org/
| > | gmane.science.biology.informatics.conductor
| > | >>
| > | >>
| > | >> --
| > | >> Computational Biology
| > | >> Fred Hutchinson Cancer Research Center
| > | >> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
| > | >>
| > | >> Location: M1-B861
| > | >> Telephone: 206 667-2793
| > |
| > |
| > | --
| > | Computational Biology
| > | Fred Hutchinson Cancer Research Center
| > | 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
| > |
| > | Location: M1-B861
| > | Telephone: 206 667-2793
| > |
| > | _______________________________________________
| > | Bioconductor mailing list
| > | Bioconductor at r-project.org
| > | https://stat.ethz.ch/mailman/listinfo/bioconductor
| > | Search the archives: http://news.gmane.org/
| > | gmane.science.biology.informatics.conductor
| > |
| > |
| >
| > --
| > Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
--
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
More information about the Bioconductor
mailing list