[BioC] dbWriteTable Failure
Elliot Joel Bernstein
elliot.bernstein at fdopartners.com
Fri Jan 14 20:35:51 CET 2011
Dirk -
Thank you for following up on this. It's possible I'm just misunderstanding the code, but I grabbed the source for RPostgreSQL ver. 0.1-7, and as far as I can tell, the work for dbWriteTable (in postgresqlWriteTable) still works by creating a temporary file. (Lines 649-660 of PostgreSQLSupport.R.)
- Elliot
On Fri, Jan 14, 2011 at 11:56:47AM -0600, Dirk Eddelbuettel wrote:
>
> On 14 January 2011 at 12:46, Sean Davis wrote:
> |
> |
> | On Fri, Jan 14, 2011 at 12:36 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
> |
> | On 01/14/2011 06:25 AM, Elliot Joel Bernstein wrote:
> | > Martin -
> | >
> | > Thanks for your reply. Is RdbiPgSQL no longer maintained at all? I
> |
> | Hi Elliot --
> |
> | (sorry for the last post, oops). The package hasn't had any substantial
> | change since 2006. I'm cc'ing the currently listed maintainer, and also
> | the maintainer of RPostgreSQL, who might have additional insights.
> |
> |
> |
> | Perhaps Dirk will comment directly but, with regards to RPosgreSQL, there is a
> | google code site and an issue for dbWriteTable was already filed.
> |
> | http://code.google.com/p/rpostgresql/issues/detail?id=4&can=1
> |
> | The resolution is that Dirk would appreciate a patch (which I have not been
> | able to supply--someone else?).
>
> Did the OP try a current version? NEWS follows below. Tomoaki has done
> some wonderful work and we are in much better shape than we were, say, one
> year ago.
>
> That said, there are always open issues. See the Google Code site; and I cc
> out dev list.
>
> Dirk
>
> Version 0.1-7 -- 2010-10-17
>
> o Several potential buffer overruns were fixed
>
> o dbWriteTable now writes a data.frame to database through a network
> connection rather than a temporary file. Note that row_names may be
> changed in future releases. Also, passing in filenames instead of
> data.frame is not supported at this time.
>
> o When no host is specified, a connection to the PostgreSQL server
> is made via UNIX domain socket (just like psql does)
>
> o Table and column names are case sensitive, and identifiers are escaped
> or quoted appropriately, so that any form of table/column names can be
> created, searched, or removed, including upper-, lower- and mixed-case.
>
> o nullOk in dbColumnInfo has a return value of NA when the column does
> not correspond to a column in the table. The utility of nullOk is
> doubtful but not removed at this time.
>
> o Correct Windows getpid() declaration (with thanks to Brian D. Ripley)
>
> o A call of as.POSIXct() with a time format string wrongly passed to TZ
> has been corrected; this should help with intra-day timestamps (with
> thanks to Steve Eick)
>
> o Usage of tmpdir has been improved on similarly to Linux (with thanks
> to Robert McGehee)
>
> Dirk
>
>
> |
> | Sean
> |
> |
> |
> | Martin
> |
> | have tried RPostgreSQL, but it has an even worse problem. In that
> | package, dbWriteTable is implemented by calling write.table on the data
> | frame to write it to a temporary file, and then load it into the
> | database, which means that the formatting options set in R affect the
> | result. Specifically, write.table by default formats some numbers in
> | scientific notation, which causes the database load to fail.
> | >
> | > After looking into the problem with RdbiPgSQL a little more, it
> | > seems
> | that it's caused by dbDisconnect calling the same function that is
> | registered as the finalizer. I think it could be easily solved by using
> | an indicator variable to keep track of whether PQfinish has been called.
> | (I can also just call rm instead of dbDisconnect, but it's a little
> | disconcerting that the package contains a function that causes memory
> | errors when used as intended.)
> |
> | > - Elliot
> | >
> | >
> | > On Tue, Jan 11, 2011 at 08:12:21PM -0800, Martin Morgan wrote:
> | >> On 01/11/2011 11:37 AM, Elliot Joel Bernstein wrote:
> | >>> I am having a problem using the dbWriteTable function in package
> | RdbiPgSQL to insert data into a PostgreSQL table. Specifically, the
> | following code sometimes works, but sometimes fails with an error message
> | indicating that there is "no database connection":
> | >>>
> | >>>
> | >>>
> | >>> =====================================================================
> | >>> dbtest.R
> | >>> =====================================================================
> | >>>
> | >>> require(RdbiPgSQL)
> | >>
> | >> Hi Elliot -- not so much an answer, but unless RdbiPgSQL is central you
> | >> might try the CRAN package RPostgreSQL, which is actively maintained.
> | >>
> | >> Martin
> | >>
> | >>>
> | >>> test <- function(conn, n) {
> | >>>
> | >>> dbSendQuery(conn, "create temp table temp (x int, y int)")
> | >>>
> | >>> n <- 10000
> | >>> dat <- data.frame(x=sample(1000,n,replace=T), y=sample(1000,n,replace
> | =T))
> | >>>
> | >>> for (i in 1:dim(dat)[1]) {
> | >>>
> | >>> err <- try(dbSendQuery(conn, sprintf("insert into temp values
> | (%d,%d)", dat$x[i], dat$y[i])))
> | >>> if (inherits(err, "try-error")) {
> | >>> print(conn)
> | >>> break
> | >>> }
> | >>>
> | >>> }
> | >>>
> | >>> print(dbGetQuery(conn, "select count(1) from temp"))
> | >>>
> | >>> }
> | >>>
> | >>> conn <- dbConnect(PgSQL(), dbname="gf", host="gf-host")
> | >>> test(conn, 10000)
> | >>> dbDisconnect(conn)
> | >>>
> | >>> =====================================================================
> | >>>
> | >>>
> | >>>
> | >>> The following is sample output from multiple runs of this script:
> | >>>
> | >>>
> | >>>
> | >>> =====================================================================
> | >>> Sample output
> | >>> =====================================================================
> | >>>
> | >>> R version 2.12.0 (2010-10-15)
> | >>> Copyright (C) 2010 The R Foundation for Statistical Computing
> | >>> ISBN 3-900051-07-0
> | >>> Platform: x86_64-redhat-linux-gnu (64-bit)
> | >>>
> | >>> R is free software and comes with ABSOLUTELY NO WARRANTY.
> | >>> You are welcome to redistribute it under certain conditions.
> | >>> Type 'license()' or 'licence()' for distribution details.
> | >>>
> | >>> Natural language support but running in an English locale
> | >>>
> | >>> R is a collaborative project with many contributors.
> | >>> Type 'contributors()' for more information and
> | >>> 'citation()' on how to cite R or R packages in publications.
> | >>>
> | >>> Type 'demo()' for some demos, 'help()' for on-line help, or
> | >>> 'help.start()' for an HTML browser interface to help.
> | >>> Type 'q()' to quit R.
> | >>>
> | >>>> source("dbtest.R")
> | >>> Loading required package: RdbiPgSQL
> | >>> Loading required package: Rdbi
> | >>> count
> | >>> 1 10000
> | >>>> source("dbtest.R")
> | >>> Error in dbSendQuery.PgSQL.conn(conn, sprintf("insert into temp values
> | (%d,%d)", :
> | >>> No database connection
> | >>> status = 1
> | >>> database.name = a
> | >>> host.name = p��
> | >>> options = 8}G
> | >>> password = P�
> | >>> port = �
> | >>> tty = ��
> | >>> socket = -1
> | >>> client.encoding = -1
> | >>> backend.pid = 0
> | >>> Error in dbSendQuery.PgSQL.conn(conn, ...) : No database connection
> | >>>
> | >>> =====================================================================
> | >>>
> | >>>
> | >>>
> | >>> Here is the result of a traceback:
> | >>>
> | >>>
> | >>>
> | >>> =====================================================================
> | >>> traceback
> | >>> =====================================================================
> | >>>
> | >>>> traceback()
> | >>> 11: .Call("PgSQLsendQuery", conn, query, PACKAGE = "RdbiPgSQL")
> | >>> 10: dbSendQuery.PgSQL.conn(conn, ...) at dbtest.R#21
> | >>> 9: dbSendQuery(conn, ...) at dbtest.R#21
> | >>> 8: dbGetResult(dbSendQuery(conn, ...)) at dbtest.R#21
> | >>> 7: dbGetQuery.PgSQL.conn(conn, "select count(1) from temp") at dbtest.R
> | #21
> | >>> 6: dbGetQuery(conn, "select count(1) from temp") at dbtest.R#21
> | >>> 5: print(dbGetQuery(conn, "select count(1) from temp")) at dbtest.R#21
> | >>> 4: test(conn, 10000)
> | >>> 3: eval.with.vis(expr, envir, enclos)
> | >>> 2: eval.with.vis(ei, envir)
> | >>> 1: source("dbtest.R")
> | >>>
> | >>> =====================================================================
> | >>>
> | >>>
> | >>>
> | >>> And here is my sessionInfo:
> | >>>
> | >>>
> | >>>
> | >>> =====================================================================
> | >>> sessionInfo
> | >>> =====================================================================
> | >>>
> | >>>> sessionInfo()
> | >>> R version 2.12.0 (2010-10-15)
> | >>> Platform: x86_64-redhat-linux-gnu (64-bit)
> | >>>
> | >>> locale:
> | >>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> | >>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> | >>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
> | >>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
> | >>> [9] LC_ADDRESS=C LC_TELEPHONE=C
> | >>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> | >>>
> | >>> attached base packages:
> | >>> [1] stats graphics grDevices utils datasets methods base
> | >>>
> | >>> other attached packages:
> | >>> [1] RdbiPgSQL_1.24.0 Rdbi_1.24.0
> | >>>
> | >>> =====================================================================
> | >>>
> | >>>
> | >>>
> | >>> I am using version 8.4.4 of the PostgreSQL-devel libraries, and the
> | server is running version 8.4.1. The problem does not always occur on the
> | second run; I am sometimes able to run the script multiple times before it
> | eventually fails. Also, it does not always fail at the same row of the
> | dataframe, nor is there anything obviously wrong with the data values on
> | the row where it fails. I would appreciate any help in diagnosing the
> | problem.
> | >>>
> | >>> Thank you very much.
> | >>>
> | >>> - Elliot
> | >>>
> | >>> _______________________________________________
> | >>> Bioconductor mailing list
> | >>> Bioconductor at r-project.org
> | >>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> | >>> Search the archives: http://news.gmane.org/
> | gmane.science.biology.informatics.conductor
> | >>
> | >>
> | >> --
> | >> Computational Biology
> | >> Fred Hutchinson Cancer Research Center
> | >> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
> | >>
> | >> Location: M1-B861
> | >> Telephone: 206 667-2793
> |
> |
> | --
> | Computational Biology
> | Fred Hutchinson Cancer Research Center
> | 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
> |
> | Location: M1-B861
> | Telephone: 206 667-2793
> |
> | _______________________________________________
> | Bioconductor mailing list
> | Bioconductor at r-project.org
> | https://stat.ethz.ch/mailman/listinfo/bioconductor
> | Search the archives: http://news.gmane.org/
> | gmane.science.biology.informatics.conductor
> |
> |
>
> --
> Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
More information about the Bioconductor
mailing list