[R] Problem(?) in strptime()

ggrothendieck at yifan.net
Tue Apr 9 02:58:29 CEST 2002


Try this:

library(chron)
dts <- dates(c("04/07/02","04/07/02"))
tms <- times(c("01:30:00","02:30:00"))
x <- chron(dts,tms)
y <- as.POSIXct(x,tz="GMT")
y   # returns date/times
y[2]-y[1]   # returns a difference of 1 hour
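
The same check can be sketched without chron, assuming a version of R in which strptime() accepts a tz argument (the R 1.4.1 shown in the quoted message does not, so treat this as an illustration rather than something that will run there). Parsing the strings as GMT means no daylight savings rule is applied, so 02:30 is not folded onto 01:30:

```r
# Assumption: strptime() has a tz argument (later R versions only).
s <- c("2002-04-07 01:30:00", "2002-04-07 02:30:00")

# Interpret both clock times as GMT, which has no DST transition.
y2 <- as.POSIXct(strptime(s, "%Y-%m-%d %H:%M:%S", tz = "GMT"), tz = "GMT")

# Elapsed time between the two instants, in hours.
d <- difftime(y2[2], y2[1], units = "hours")
d
```

The names s, y2 and d are just for this example.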





On 8 Apr 2002 at 12:33, Don MacQueen wrote:

> I think the following examples illustrate the crux of the matter 
> (version and OS info are below).
> 
> The problem has to do with the transition from standard time to 
> daylight savings time. My timezone, US/Pacific, has two parts: 
> standard time (PST) 8 hours behind GMT and daylight savings time 
> (PDT) 7 hours behind GMT. The transition takes place this year on 7 
> April at 02:00, when 02:00 is re-labeled 03:00.
> 
> ## April 6, 01:30 and 02:30
> >  ISOdatetime(2002, 4, 6, 1:2, 30, 0,tz='GMT')
> [1] "2002-04-05 17:30:00 PST" "2002-04-05 18:30:00 PST"
> 
> ## April 7, 01:30 and 02:30
> >  ISOdatetime(2002, 4, 7, 1:2, 30, 0,tz='GMT')
> [1] "2002-04-06 17:30:00 PST" "2002-04-06 17:30:00 PST"
> 
> The dates supplied are one day apart. The times supplied are one hour 
> apart. However, the times returned are one hour apart in the first 
> case, but identical in the second case.
> 
> >  tmp <- ISOdatetime(2002, 4, 7, 1:2, 30, 0,tz='GMT')
> >  identical(tmp[1],tmp[2])
> [1] TRUE
> 
> Of the four values returned, the last one is incorrect, because 
> 2002-4-7 2:30 GMT truly is 2002-4-6 18:30 PST.
> 
> What I need is a way to have that fourth case interpreted correctly.
> 
> 
> Investigating a bit:
> 
> >  ISOdatetime
> function (year, month, day, hour, min, sec, tz = "")
> {
>      x <- paste(year, month, day, hour, min, sec)
>      as.POSIXct(strptime(x, "%Y %m %d %H %M %S"), tz = tz)
> }
> 
> >  strptime
> function (x, format)
> .Internal(strptime(x, format))
> 
> ISOdatetime() uses strptime(), and strptime() does not use the 
> timezone information. Indeed, from ?strptime, TZ as part of a format 
> specification is available for output only.
> 
> As far as I can tell, strptime() interprets everything in the local 
> timezone, and when provided a time such as 2002-4-7 2:30 that 
> "doesn't exist" in the local timezone, makes a reasonable attempt to 
> guess what the user meant. But it doesn't work for what ISOdatetime() 
> does with it when tz is something other than ''.
> 
> What I need is a way to tell R that the date-time string really truly 
> should be interpreted as GMT. I haven't found a way. (maybe setenv TZ 
> GMT before starting R, but I'm still exploring that)
> 
> Also as far as I can tell, strptime() uses an OS-supplied strptime if 
> one is available, and R is entirely dependent on its behavior. I 
> don't entirely understand what man strptime on my system says about 
> this, but maybe it suggests that timezone information might be used 
> if provided...
> 
>       %Z    Timezone name or no characters if no time zone  infor-
>             mation  exists.  Local timezone information is used as
>             though  strptime()  called  tzset()  (see  ctime(3C)).
>             Errors  may not be detected.  This behavior is subject
>             to change in a future release.
> 
> 
> >  Sys.getlocale()
> [1] "C"
> >  Sys.getenv('TZ')
>            TZ
> "US/Pacific"
> >  version
>          _
> platform sparc-sun-solaris2.7
> arch     sparc
> os       solaris2.7
> system   sparc, solaris2.7
> status
> major    1
> minor    4.1
> year     2002
> month    01
> day      30
> language R
> I tried changing the locale
>     Sys.setlocale('LC_TIME','en_GB')
> (based on entries in /usr/lib/locale/lcttab), and
>     Sys.putenv('TZ=GMT')
> to no avail.
> 
> ----------------------------------------------
> This whole thing is motivated by the fact that I am receiving some 
> data that is time-stamped, and the time stamps (in addition to having 
> a poorly chosen format) ignore the daylight savings time convention. 
> That is, they always use an 8 hour offset from GMT. Thus, the three 
> times shown are in fact an hourly sequence.
> 
> Sun Apr 07 01:30:58 2002
> Sun Apr 07 02:30:58 2002
> Sun Apr 07 03:30:58 2002
> 
> In order to convert these correctly to POSIXct, I thought a 
> reasonable approach would be to tell R that they are in GMT, read 
> them as such, and then convert to US/Pacific.
> 
> Here is what I have been using.
> 
> tmpd <- c('Sun Apr 07 01:30:58 2002',
>            'Sun Apr 07 02:30:58 2002',
>            'Sun Apr 07 03:30:58 2002')
> tmpt <- as.POSIXct(strptime(tmpd,'%a %b %d %H:%M:%S %Y'),tz='GMT')+28800
> 
> >  tmpt
> [1] "2002-04-07 01:30:58 PST" "2002-04-07 01:30:58 PST"
> [3] "2002-04-07 04:30:58 PDT"
> 
> It works for the first and last times, but not the middle one
> (3:30 "PST" = 4:30 PDT is correct, but 2:30 "PST" should be 3:30 PDT).
> 
> I would appreciate help finding a way that works for all of them 
> simultaneously.
> 
> Thanks
> -Don
> 
> 
> -----------------
> Here are some more attempts at various ways of looking at these 
> dates, if anyone cares to wade through them.
> 
> >  strptime('2002-4-7 1:30' , '%Y-%m-%d %H:%M')
> [1] "2002-04-07 01:30:00"
> >  strptime('2002-4-7 2:30' , '%Y-%m-%d %H:%M')
> [1] "2002-04-07 01:30:00"
> >  strptime('2002-4-7 3:30' , '%Y-%m-%d %H:%M')
> [1] "2002-04-07 03:30:00"
> 
> The first and last display as two hours apart.
> The second one is interpreted by strptime() to be the same as the 
> first one. Not unreasonable, but problematic as illustrated above.
> 
> -------
> >  as.numeric(as.POSIXct(strptime('2002-4-7 1:30' , '%Y-%m-%d %H:%M')))
> [1] 1018171800
> >  as.numeric(as.POSIXct(strptime('2002-4-7 2:30' , '%Y-%m-%d %H:%M')))
> [1] 1018171800
> >  as.numeric(as.POSIXct(strptime('2002-4-7 3:30' , '%Y-%m-%d %H:%M')))
> [1] 1018175400
> 
> >  1018175400 - 1018171800
> [1] 3600
> 
> But in fact, the first and last are only one hour apart. This is 
> correct, because the first one is PST, the third one is PDT.
> 
> -------
> >  as.POSIXct(strptime('2002-4-7 1:30' , '%Y-%m-%d %H:%M'),tz='GMT')
> [1] "2002-04-06 17:30:00 PST"
> >  as.POSIXct(strptime('2002-4-7 2:30' , '%Y-%m-%d %H:%M'),tz='GMT')
> [1] "2002-04-06 17:30:00 PST"
> >  as.POSIXct(strptime('2002-4-7 3:30' , '%Y-%m-%d %H:%M'),tz='GMT')
> [1] "2002-04-06 19:30:00 PST"
> 
> >  as.POSIXct(strptime('2002-4-7 1:30' , '%Y-%m-%d %H:%M'),tz='US/Pacific')
> [1] "2002-04-07 01:30:00 PST"
> >  as.POSIXct(strptime('2002-4-7 2:30' , '%Y-%m-%d %H:%M'),tz='US/Pacific')
> [1] "2002-04-07 01:30:00 PST"
> >  as.POSIXct(strptime('2002-4-7 3:30' , '%Y-%m-%d %H:%M'),tz='US/Pacific')
> [1] "2002-04-07 03:30:00 PDT"
> 
> -- 
> --------------------------------------
> Don MacQueen
> Environmental Protection Department
> Lawrence Livermore National Laboratory
> Livermore, CA, USA
> --------------------------------------
> -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
> _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
> 
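
For the time stamps in the quoted message that always use an 8 hour offset from GMT, here is a minimal sketch of the "read as GMT, then shift" idea. It assumes a version of R where strptime() takes a tz argument, an English (or C) LC_TIME locale so that "Sun" and "Apr" are recognized, and a system time zone database that knows "US/Pacific". The variable names fmt and gmt are made up for the example.

```r
# Time stamps from the quoted message: fixed 8 hour offset from GMT,
# no daylight savings adjustment.
tmpd <- c("Sun Apr 07 01:30:58 2002",
          "Sun Apr 07 02:30:58 2002",
          "Sun Apr 07 03:30:58 2002")
fmt <- "%a %b %d %H:%M:%S %Y"

# Read the clock times as if they were GMT (so 02:30 is a valid time),
# then add the fixed 8 hours to get the true instants.
gmt <- as.POSIXct(strptime(tmpd, fmt, tz = "GMT"), tz = "GMT") + 8 * 3600

# Spacing between successive instants, in seconds.
diff(as.numeric(gmt))   # 3600 3600, i.e. an hourly sequence

# Only the printed representation depends on the display time zone.
format(gmt, tz = "US/Pacific", usetz = TRUE)
```

Because the parsing never goes through the local time zone, the middle value is not collapsed onto the first one; the conversion to US/Pacific happens only when the values are formatted for display.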

