[R] Problem(?) in strptime()
ggrothendieck@yifan.net
Tue Apr 9 02:58:29 CEST 2002
Try this:
library(chron)
dts <- dates(c("04/07/02","04/07/02"))
tms <- times(c("01:30:00","02:30:00"))
x <- chron(dts,tms)
y <- as.POSIXct(x,tz="GMT")
y # returns date/times
y[2]-y[1] # returns a difference of 1 hour
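The chron route works because chron objects carry no time zone, so 02:30 is never treated as a skipped local hour. A minimal base-R sketch of the same check, assuming a later version of R in which strptime() accepts a tz argument (the R 1.4.1 shown in the quoted message does not have it):

```r
# Assumption: strptime() has a tz argument (not available in R 1.4.1).
x <- c("04/07/02 01:30:00", "04/07/02 02:30:00")

# Parse the clock times directly as GMT, so the local DST gap is never consulted.
y <- as.POSIXct(strptime(x, "%m/%d/%y %H:%M:%S", tz = "GMT"))

# The two instants should be 3600 seconds apart.
gap <- as.numeric(difftime(y[2], y[1], units = "secs"))
gap
```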
On 8 Apr 2002 at 12:33, Don MacQueen wrote:
> I think the following examples illustrate the crux of the matter
> (version and OS info are below).
>
> The problem has to do with the transition from standard time to
> daylight savings time. My timezone, US/Pacific, has two parts:
> standard time (PST) 8 hours behind GMT and daylight savings time
> (PDT) 7 hours behind GMT. The transition takes place this year on 7
> April at 02:00, when 02:00 is re-labeled 03:00.
>
> ## April 6, 01:30 and 02:30
> > ISOdatetime(2002, 4, 6, 1:2, 30, 0,tz='GMT')
> [1] "2002-04-05 17:30:00 PST" "2002-04-05 18:30:00 PST"
>
> ## April 7 , 01:30 and 02:30
> > ISOdatetime(2002, 4, 7, 1:2, 30, 0,tz='GMT')
> [1] "2002-04-06 17:30:00 PST" "2002-04-06 17:30:00 PST"
>
> The dates supplied are one day apart. The times supplied are one hour
> apart. However, the times returned are one hour apart in the first
> case, but identical in the second case.
>
> > tmp <- ISOdatetime(2002, 4, 7, 1:2, 30, 0,tz='GMT')
> > identical(tmp[1],tmp[2])
> [1] TRUE
>
> Of the four values returned, the last one is incorrect, because
> 2002-4-7 2:30 GMT truly is 2002-4-6 18:30 PST.
>
> What I need is a way to have that fourth case interpreted correctly.
>
>
> Investigating a bit:
>
> > ISOdatetime
> function (year, month, day, hour, min, sec, tz = "")
> {
> x <- paste(year, month, day, hour, min, sec)
> as.POSIXct(strptime(x, "%Y %m %d %H %M %S"), tz = tz)
> }
>
> > strptime
> function (x, format)
> .Internal(strptime(x, format))
>
> ISOdatetime() uses strptime(), and strptime() does not use the
> timezone information. Indeed, from ?strptime, TZ as part of a format
> specification is available for output only.
>
> As far as I can tell, strptime() interprets everything in the local
> timezone, and when provided a time such as 2002-4-7 2:30 that
> "doesn't exist" in the local timezone, makes a reasonable attempt to
> guess what the user meant. But it doesn't work for what ISOdatetime()
> does with it when tz is something other than ''.
>
> What I need is a way to tell R that the date-time string really truly
> should be interpreted as GMT. I haven't found a way. (maybe setenv TZ
> GMT before starting R, but I'm still exploring that)
>
> Also as far as I can tell, strptime() uses an OS-supplied strptime if
> one is available, and R is entirely dependent on its behavior. I
> don't entirely understand what man strptime on my system says about
> this, but maybe it suggests that timezone information might be used
> if provided...
>
> %Z Timezone name or no characters if no time zone
> information exists. Local timezone information is used as
> though strptime() called tzset() (see ctime(3C)).
> Errors may not be detected. This behavior is subject
> to change in a future release.
>
>
> > Sys.getlocale()
> [1] "C"
> > Sys.getenv('TZ')
> TZ
> "US/Pacific"
>
> > version
> _
> platform sparc-sun-solaris2.7
> arch sparc
> os solaris2.7
> system sparc, solaris2.7
> status
> major 1
> minor 4.1
> year 2002
> month 01
> day 30
> language R
>
> I tried changing the locale
> Sys.setlocale('LC_TIME','en_GB')
> (based on entries in /usr/lib/locale/lcttab), and
> Sys.putenv('TZ=GMT')
> to no avail.
>
> ----------------------------------------------
> This whole thing is motivated by the fact that I am receiving some
> data that is time-stamped, and the time stamps (in addition to having
> a poorly chosen format) ignore the daylight savings time convention.
> That is, they always use an 8 hour offset from GMT. Thus, the three
> times shown are in fact an hourly sequence.
>
> Sun Apr 07 01:30:58 2002
> Sun Apr 07 02:30:58 2002
> Sun Apr 07 03:30:58 2002
>
> In order to convert these correctly to POSIXct, I thought a
> reasonable approach would be to tell R that they are in GMT, read
> them as such, and then convert to US/Pacific.
>
> Here is what I have been using.
>
> tmpd <- c('Sun Apr 07 01:30:58 2002',
> 'Sun Apr 07 02:30:58 2002',
> 'Sun Apr 07 03:30:58 2002')
> tmpt <- as.POSIXct(strptime(tmpd,'%a %b %d %H:%M:%S %Y'),tz='GMT')+28800
>
> > tmpt
> [1] "2002-04-07 01:30:58 PST" "2002-04-07 01:30:58 PST"
> [3] "2002-04-07 04:30:58 PDT"
>
> It works for the first and last times, but not the middle one
> (3:30 "PST" = 4:30 PDT is correct, but 2:30 "PST" should be 3:30 PDT).
>
> I would appreciate help finding a way that works for all of them
> simultaneously.
>
> Thanks
> -Don
>
>
> -----------------
> Here are some more attempts at various ways of looking at these
> dates, if anyone cares to wade through them.
>
> > strptime('2002-4-7 1:30' , '%Y-%m-%d %H:%M')
> [1] "2002-04-07 01:30:00"
> > strptime('2002-4-7 2:30' , '%Y-%m-%d %H:%M')
> [1] "2002-04-07 01:30:00"
> > strptime('2002-4-7 3:30' , '%Y-%m-%d %H:%M')
> [1] "2002-04-07 03:30:00"
>
> The first and last display as two hours apart.
> The second one is interpreted by strptime() to be the same as the
> first one. Not unreasonable, but problematic as illustrated above.
>
> -------
> > as.numeric(as.POSIXct(strptime('2002-4-7 1:30' , '%Y-%m-%d %H:%M')))
> [1] 1018171800
> > as.numeric(as.POSIXct(strptime('2002-4-7 2:30' , '%Y-%m-%d %H:%M')))
> [1] 1018171800
> > as.numeric(as.POSIXct(strptime('2002-4-7 3:30' , '%Y-%m-%d %H:%M')))
> [1] 1018175400
>
> > 1018175400 - 1018171800
> [1] 3600
>
> But in fact, the first and last are only one hour apart. This is
> correct, because the first one is PST, the third one is PDT.
>
> -------
> > as.POSIXct(strptime('2002-4-7 1:30' , '%Y-%m-%d %H:%M'),tz='GMT')
> [1] "2002-04-06 17:30:00 PST"
> > as.POSIXct(strptime('2002-4-7 2:30' , '%Y-%m-%d %H:%M'),tz='GMT')
> [1] "2002-04-06 17:30:00 PST"
> > as.POSIXct(strptime('2002-4-7 3:30' , '%Y-%m-%d %H:%M'),tz='GMT')
> [1] "2002-04-06 19:30:00 PST"
>
> > as.POSIXct(strptime('2002-4-7 1:30' , '%Y-%m-%d %H:%M'),tz='US/Pacific')
> [1] "2002-04-07 01:30:00 PST"
> > as.POSIXct(strptime('2002-4-7 2:30' , '%Y-%m-%d %H:%M'),tz='US/Pacific')
> [1] "2002-04-07 01:30:00 PST"
> > as.POSIXct(strptime('2002-4-7 3:30' , '%Y-%m-%d %H:%M'),tz='US/Pacific')
> [1] "2002-04-07 03:30:00 PDT"
>
> --
> --------------------------------------
> Don MacQueen
> Environmental Protection Department
> Lawrence Livermore National Laboratory
> Livermore, CA, USA
> --------------------------------------
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
>
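For the three timestamps in the quoted message, the GMT idea can be carried through to US/Pacific. This is only a sketch under assumptions that do not come from the original post: a later R in which strptime() takes a tz argument, an English LC_TIME locale so that %a and %b match "Sun" and "Apr", and the zone name "America/Los_Angeles" for US/Pacific.

```r
tmpd <- c("Sun Apr 07 01:30:58 2002",
          "Sun Apr 07 02:30:58 2002",
          "Sun Apr 07 03:30:58 2002")

# Read the clock times as GMT so none of them falls in a local DST gap.
# Assumption: strptime() has a tz argument (not available in R 1.4.1).
gmt <- as.POSIXct(strptime(tmpd, "%a %b %d %H:%M:%S %Y", tz = "GMT"))

# The stamps are really 8 hours behind GMT, so shift to the true instants.
instants <- gmt + 8 * 3600

# Spacing between consecutive instants, in seconds.
steps <- diff(as.numeric(instants))

# Display in Pacific time; DST is applied only at this formatting step.
pacific <- format(instants, tz = "America/Los_Angeles", usetz = TRUE)
pacific
```

If the approach holds, steps is 3600 throughout and the middle stamp displays as 03:30:58 PDT rather than collapsing onto the first one.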