[R] Time Zone problems: midnight goes in; 8am comes out

Andrew Simmons
Wed Mar 2 03:50:31 CET 2022


It seems like the current version of lubridate is 1.8.0, which does
raise a warning for an invalid timezone, just like as.POSIXct. This is
what I tried:


print(lubridate::parse_date_time("1970-01-01 00:01:00",          "ymd HMS"          , tz = "PST"))
print(as.POSIXct                ("1970-01-01 00:01:00", format = "%Y-%m-%d %H:%M:%S", tz = "PST"))


outputs:


> print(lubridate::parse_date_time("1970-01-01 00:01:00",          "ymd HMS"          , tz = "PST"))
[1] "1970-01-01 08:01:00 GMT"
Warning message:
In as.POSIXlt.POSIXct(x, tz) : unknown timezone 'PST'
> print(as.POSIXct                ("1970-01-01 00:01:00", format = "%Y-%m-%d %H:%M:%S", tz = "PST"))
[1] "1970-01-01 00:01:00 GMT"
Warning messages:
1: In strptime(x, format, tz = tz) : unknown timezone 'PST'
2: In as.POSIXct.POSIXlt(as.POSIXlt(x, tz, ...), tz, ...) :
  unknown timezone 'PST'
3: In as.POSIXlt.POSIXct(x, tz) : unknown timezone 'PST'
>


But I see neither the warning nor the 8-hour shift when using
`tz = "America/Los_Angeles"` or `tz = "PST8PDT"`.


print(lubridate::parse_date_time("1970-01-01 00:01:00",          "ymd HMS"          , tz = "PST8PDT"))
print(as.POSIXct                ("1970-01-01 00:01:00", format = "%Y-%m-%d %H:%M:%S", tz = "PST8PDT"))


outputs:


> print(lubridate::parse_date_time("1970-01-01 00:01:00",          "ymd HMS"          , tz = "PST8PDT"))
[1] "1970-01-01 00:01:00 PST"
> print(as.POSIXct                ("1970-01-01 00:01:00", format = "%Y-%m-%d %H:%M:%S", tz = "PST8PDT"))
[1] "1970-01-01 00:01:00 PST"
>
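
One way to catch a bad name before parsing is to check it against
`OlsonNames()`, which lists the time zone names known to the system's tz
database. This is only a sketch: `check_tz` is a hypothetical helper name (not
part of base R or lubridate), and it assumes that "America/Los_Angeles" is
present in your tz database.

```r
# check_tz() is a hypothetical helper, not a base R or lubridate function.
# It stops with an error instead of letting an unknown tz name slip through.
check_tz <- function(tz) {
  if (!(tz %in% OlsonNames())) {
    stop("unknown time zone: ", tz, call. = FALSE)
  }
  tz
}

x <- as.POSIXct("1970-01-01 00:01:00", format = "%Y-%m-%d %H:%M:%S",
                tz = check_tz("America/Los_Angeles"))
as.numeric(x)  # 28860, i.e. 8 hours and 1 minute after the UTC epoch
```

With a name that is not in `OlsonNames()`, `check_tz()` stops immediately, so
the parsing call is never reached.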


I would hesitate to use `tz = Sys.timezone()`, because the result then
depends on the time zone setting of the machine running the code: someone
in another province/state would get different results. Whether that
matters depends on whether this work is being shared with other people,
so it's up to you.
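
For inputs that carry a "PST" or "PDT" suffix, like the ones described in the
quoted message, one possible approach is to strip the abbreviation and parse
with an explicit zone name. This is a sketch under assumptions: the two sample
strings and the variable names are made up for illustration, and it assumes
every input really is Pacific time. Note that dropping the suffix loses the
information needed to tell apart the repeated hour when clocks fall back.

```r
# Sample inputs, invented for illustration only.
inputs <- c("2022-03-01 15:54:30 PST", "2022-07-01 15:54:30 PDT")

# Drop the trailing abbreviation; the zone name supplies the offset instead.
stamp <- sub(" (PST|PDT)$", "", inputs)

parsed <- as.POSIXct(stamp, format = "%Y-%m-%d %H:%M:%S",
                     tz = "America/Los_Angeles")

# %Z prints the abbreviation that R derives from the zone rules.
format(parsed, "%Y-%m-%d %H:%M:%S %Z")
```

Because the zone is written out rather than taken from `Sys.timezone()`, the
result does not depend on where the code is run.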

On Tue, Mar 1, 2022 at 8:51 PM Boylan, Ross via R-help
<r-help using r-project.org> wrote:
>
> I'm having problems with timezones using lubridate, but it's not clear to me the difficulty is in lubridate.
> ---------------------------------
> > r2 <- parse_date_time("1970-01-01 00:01:00", "ymd HMS", tz="PST")
> > r2
> [1] "1970-01-01 08:01:00 PST"  ## Oops: midnight has turned into 8am
> > as.numeric(r2)
> [1] 28860
> > 8*3600 # seconds in 8 hours
> [1] 28800
> ------------------------------------
> lubridate accepts PST as the time zone, and the result prints "PST" for timezone.  Further, lubridate seems to be using the tz properly since it gets the 8 hour offset from UTC correct.
>
> The problem is the value that is printed gives a UTC time of 08:01 despite having the PST suffix.  So the time appears to have jumped 8 hours ahead from the value parsed.
>
> PST appears not to be a legal timezone (in spite of lubridate inferring the correct offset from it):
> ---------------------------------------------------
> > Sys.timezone()
> [1] "America/Los_Angeles"
>
> > (grep("PST", OlsonNames(), value=TRUE))
> [1] "PST8PDT"         "SystemV/PST8"    "SystemV/PST8PDT"
> -------------------------------------
> https://www.r-bloggers.com/2018/07/a-tour-of-timezones-troubles-in-r/ says lubridate will complain if given an invalid tz, though I don't see that explicitly in the current man page https://lubridate.tidyverse.org/reference/parse_date_time.html.  As shown above, parse_date_time() does not complain about the timezone, and does use it to get the correct offset.
>
> Using America/Los_Angeles produces the expected results:
> ---------------------------------------
> > r4 <- parse_date_time("1970-01-01 00:01:00", "ymd HMS", tz=Sys.timezone())
> > r4
> [1] "1970-01-01 00:01:00 PST"  # still prints PST.  This time it's true!
> > as.numeric(r4)
> [1] 28860
> ----------------------------------------------------
>
> I suppose I can just use "America/Los_Angeles" as the time zone; this would have the advantage of making all my timezones the same, which is apparently what R requires for a vector of datetimes.  But the behavior seems odd, and the "fix" also requires me to ignore the time zone specified in my inputs, which look like "2022-03-01 15:54:30 PST" or PDT, depending on time of year.
>
> 1. Why this strange behavior in which PST or PDT is used to construct the proper offset from UTC, and then kind of forgotten on output?
> 2. Is this a bug in lubridate or base POSIXct, particularly its print routine?
>
> My theory on 1 is that lubridate understands PST and constructs an appropriate UTC time.  POSIXct time does not understand a tz of "PST" and so prints out the UTC value for the time, "decorating" it with the not understood tz value.
>
> For 2, on one hand, lubridate is constructing POSIXct dates with invalid tz values; lubridate probably shouldn't.  On the other hand, POSIXct is printing a UTC time but labeling it with a tz it doesn't understand, so it looks as if it's in that local time even though it isn't.  In the context above that seems like a bug, but it's possible a lot of code depends on it.
>
> Under these theories, the problems only arise because the set of tz values understood by lubridate differs from the set of tz values understood by POSIXct.
>
> Versions:
> R 3.5.2
> lubridate 1.7.4
> Debian GNU/Linux 10 aka buster (amd64 flavor)
>
> Thanks.
> Ross Boylan