[R] Time and db precision
Mikkel Grum
mi2kelgrum at yahoo.com
Thu May 26 17:04:24 CEST 2011
Thanks Marc,
I had just come up with another, slightly more convoluted solution. Add as.is = TRUE to the query and then get the timetoken with
timetoken <- df$timestamp[df$timestamp == max(as.POSIX(df$timestamp))]
While it looks like options(digits.secs = 6) works, I worry that theoretically it just pushes the problem down to another level of decimal points. With the solution above, I apparently get the exact same value that was in the database.
Interestingly,
timetoken <- max(as.POSIX(df$timestamp))
does not appear to give me the same result.
Demo:
> a <- "2011-05-25 22:15:11.027116000"
> b <- "2011-05-25 22:15:11.027117000"
> c <- "2011-05-25 22:15:11.027118000"
> d <- c(a, b, c)
> d
[1] "2011-05-25 22:15:11.027116000" "2011-05-25 22:15:11.027117000" "2011-05-25 22:15:11.027118000"
> d[d == max(as.POSIXct(d))]
[1] "2011-05-25 22:15:11.027118000"
> max(as.POSIXct(d))
[1] "2011-05-25 22:15:11 COT"
--- On Thu, 5/26/11, Marc Schwartz <marc_schwartz at me.com> wrote:
> From: Marc Schwartz <marc_schwartz at me.com>
> Subject: Re: [R] Time and db precision
> To: "Mikkel Grum" <mi2kelgrum at yahoo.com>
> Cc: "R Help" <r-help at r-project.org>
> Date: Thursday, May 26, 2011, 8:22 AM
>
> On May 25, 2011, at 6:25 PM, Mikkel Grum wrote:
>
> > I have a loop that regularly checks for new data to
> analyse in my database. In order to facilitate this, the
> database table has a timestamp column with the time that the
> data was inserted into the database. Something like this:
> >
> > while (....) {
> > load(timetoken.Rdata)
> > df <- sqlQuery(con, paste("SELECT *
> FROM tabledf WHERE timestamp > ", timetoken, sep = ""))
> > analyse(df)
> > timetoken <- max(df$timestamp)
> > save(timetoken, file =
> "timetoken.Rdata")
> > Sys.sleep(60)
> > }
> >
> > Now this used to work fairly well with R and
> PostgreSQL on Windows, but on Ubuntu, I'm getting a lot of
> data being pulled up again and again. I suspect what is
> happening is that R is rounding off to the nearest second,
> while PostgreSQL is using a much higher level of precision,
> so that if no new data has come in in the meantime, chances
> are fairly high (50% ??) that the PostgreSQL timestamp is
> higher than the version that has been rounded off by R.
> >
> > Is there any way of recording the timestamp in R
> exactly as it is in PostgreSQL? Or is there another way of
> dealing with this??
> >
> > sessionInfo()
> > R version 2.11.1 (2010-05-31)
> > x86_64-pc-linux-gnu
> >
> > locale:
> > [1] C
> >
> > attached base packages:
> > [1] stats graphics
> grDevices utils datasets
> methods base
> >
> > other attached packages:
> > [1] RODBC_1.3-1
> >
> > All assistance greatly appreciated.
> >
> > Mikkel
>
>
> This query is better suited for R-SIG-DB:
>
> https://stat.ethz.ch/mailman/listinfo/r-sig-db
>
> That being said:
>
> See ?POSIXct
>
> Check the actual output of paste("SELECT * FROM tabledf
> WHERE timestamp > ", timetoken, sep = "") to see what
> value 'timetoken' is actually taking as it is used in the
> query construct. As is noted in the above help file, be sure
> that options("digits.secs") is properly set, since the
> default will be to round printed output to the nearest
> second:
>
> # A clean R session on OSX
> > options("digits.secs")
> $digits.secs
> NULL
>
> # return current date/time as POSIXct
>
> > Sys.time()
> [1] "2011-05-26 08:11:37 CDT"
>
> options(digits.secs = 6)
>
> > Sys.time()
> [1] "2011-05-26 08:12:07.080329 CDT"
>
>
> options(digits.secs = 0)
>
> > paste("SELECT * FROM tabledf WHERE timestamp > ",
> Sys.time(), sep = "")
> [1] "SELECT * FROM tabledf WHERE timestamp > 2011-05-26
> 08:15:02"
>
> options(digits.secs = 6)
>
> > paste("SELECT * FROM tabledf WHERE timestamp > ",
> Sys.time(), sep = "")
> [1] "SELECT * FROM tabledf WHERE timestamp > 2011-05-26
> 08:15:12.005103"
>
>
> HTH,
>
> Marc Schwartz
>
>
More information about the R-help
mailing list