[R-pkg-devel] Issue handling datetimes: possible differences between computers
Simon Urbanek
simon.urbanek at R-project.org
Mon Oct 10 03:57:06 CEST 2022
Alexandre,
It's better to parse the timestamp in the correct time zone:
> foo = as.POSIXlt("2021-10-01", "UTC")
> as.POSIXct(as.character(foo), "Europe/Berlin")
[1] "2021-10-01 CEST"
The issue stems from the fact that you are treating your timestamp as UTC (which it is not) while wanting to interpret the same field values in a different time zone. The DST flag depends on the date (isdst is 0 or 1 depending on the day), and the POSIXlt object does not carry the right value because you only attached the new time zone without updating the flag:
> str(unclass(as.POSIXlt(foo, "Europe/Berlin")))
List of 9
$ sec : num 0
$ min : int 0
$ hour : int 0
$ mday : int 1
$ mon : int 9
$ year : int 121
$ wday : int 5
$ yday : int 273
$ isdst: int 0
- attr(*, "tzone")= chr "Europe/Berlin"
Note that isdst is 0, carried over from the UTC entry (UTC has no DST), even though that date falls within DST (CEST) in Europe/Berlin. Compare that to the correctly parsed POSIXlt:
> str(unclass(as.POSIXlt(as.character(foo), "Europe/Berlin")))
List of 11
$ sec : num 0
$ min : int 0
$ hour : int 0
$ mday : int 1
$ mon : int 9
$ year : int 121
$ wday : int 5
$ yday : int 273
$ isdst : int 1
$ zone : chr "CEST"
$ gmtoff: int NA
- attr(*, "tzone")= chr "Europe/Berlin"
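where isdst is 1 since that date is indeed in DST. The OS difference seems to be that Linux respects the isdst information from POSIXlt while Windows and macOS ignore it. That is why Linux returns 01:00:00 CEST: midnight read as standard time (CET) corresponds to 01:00 in CEST. This behavior is documented: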
At all other times ‘isdst’ can be deduced from the
first six values, but the behaviour if it is set incorrectly is
platform-dependent.
You can re-set isdst to -1 to make sure R will try to determine it:
> foo$isdst = -1L
> as.POSIXct(foo, "Europe/Berlin")
[1] "2021-10-01 CEST"
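The same reset can be wrapped in a small function so that it is applied every time before converting. This is only a sketch: `lt_to_ct` is a hypothetical helper name, not something in base R, and it assumes you want to keep the broken-down fields (year, month, day, hour, ...) as they are and read them in the zone `tz`.

```r
# Hypothetical helper (not part of base R): convert a POSIXlt to POSIXct in
# time zone `tz`, ignoring whatever isdst flag the object currently carries.
lt_to_ct <- function(x, tz) {
  stopifnot(inherits(x, "POSIXlt"))
  # -1 means "unknown", so R determines DST from the other fields and `tz`
  x$isdst <- rep_len(-1L, length(x$isdst))
  as.POSIXct(x, tz = tz)
}

# The object from the original report: fields for 2021-10-01 00:00:00,
# tagged as UTC, with isdst = 0
foo <- structure(list(sec = 0, min = 0L, hour = 0L, mday = 1L, mon = 9L,
                      year = 121L, wday = 5L, yday = 273L, isdst = 0L),
                 class = c("POSIXlt", "POSIXt"), tzone = "UTC")

bar <- lt_to_ct(foo, "Europe/Berlin")
format(bar, "%Y-%m-%d %H:%M:%S %Z")
```

Because the flag is no longer trusted, the result should not depend on how the platform treats an inconsistent isdst.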
So, generally, you cannot simply change the time zone in a POSIXlt. Don't pretend the time is in UTC if it is not: you have to re-parse or re-compute the timestamps for the result to be reliable, or else the DST flag will be wrong.
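To make "re-parse" and "re-compute" concrete, here is a minimal sketch of the two operations, which answer different questions. The variable names `same_clock` and `same_instant` are just illustrative, and the formatting step assumes whole seconds (fractional seconds would be dropped by this format string).

```r
foo <- as.POSIXlt("2021-10-01", tz = "UTC")

# (a) Re-parse: keep the clock reading (2021-10-01 00:00:00) and interpret
#     it in another zone. Going through text discards the old isdst flag.
fmt <- "%Y-%m-%d %H:%M:%S"
same_clock <- as.POSIXct(format(foo, fmt), tz = "Europe/Berlin", format = fmt)

# (b) Re-compute: keep the instant (midnight UTC) and only change the zone
#     it is displayed in. The POSIXlt fields, including isdst, are rebuilt.
same_instant <- as.POSIXlt(as.POSIXct(foo), tz = "Europe/Berlin")

format(same_clock, "%Y-%m-%d %H:%M:%S %Z")
format(same_instant, "%Y-%m-%d %H:%M:%S %Z")
same_instant$isdst
```

Which of the two is right depends on what the data in the Excel file actually represent: local clock readings, or instants that were stored in UTC.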
Cheers,
Simon
> On 10/10/2022, at 1:14 AM, Alexandre Courtiol <alexandre.courtiol using gmail.com> wrote:
>
> Hi R pkg developers,
>
> We are facing a datetime handling issue which manifests itself in a
> package we are working on.
>
> In context, we noticed that reading datetime info from an excel file
> resulted in different data depending on the computer we used.
>
> We are aware that timezone and regional settings are general sources
> of troubles, but the code we are using was trying to circumvent this.
> We went only as far as figuring out that the issue happens when
> converting a POSIXlt into a POSIXct.
>
> Please find below, a minimal reproducible example where `foo` is
> converted to `bar` on two different computers.
> `foo` is a POSIXlt with a defined time zone and upon conversion to a
> POSIXct, despite using a set time zone, we end up with `bar` being
> different on Linux and on a Windows machine.
>
> We noticed that the difference emerges from the system call
> `.Internal(as.POSIXct())` within `as.POSIXct.POSIXlt()`.
> We also noticed that the internal function in R actually calls
> getenv("TZ") within C, which is probably what explains where the
> difference comes from.
>
> Such a behaviour is probably expected and not a bug, but what would be
> the strategy to convert a POSIXlt into a POSIXct that would not be
> machine dependent?
>
> We finally noticed that depending on the datetime used as a starting
> point and on the time zone used when calling `as.POSIXct()`, we
> sometimes have a difference between computers and sometimes not...
> which adds to our puzzlement.
>
> Many thanks.
> Alex & Liam
>
>
> ``` r
> ## On Linux
> foo <- structure(list(sec = 0, min = 0L, hour = 0L, mday = 1L, mon =
> 9L, year = 121L, wday = 5L, yday = 273L, isdst = 0L),
> class = c("POSIXlt", "POSIXt"), tzone = "UTC")
>
> bar <- as.POSIXct(foo, tz = "Europe/Berlin")
>
> bar
> #> [1] "2021-10-01 01:00:00 CEST"
>
> dput(bar)
> #> structure(1633042800, class = c("POSIXct", "POSIXt"), tzone = "Europe/Berlin")
> ```
>
> ``` r
> ## On Windows
> foo <- structure(list(sec = 0, min = 0L, hour = 0L, mday = 1L, mon =
> 9L, year = 121L, wday = 5L, yday = 273L, isdst = 0L),
> class = c("POSIXlt", "POSIXt"), tzone = "UTC")
>
> bar <- as.POSIXct(foo, tz = "Europe/Berlin")
>
> bar
> #> [1] "2021-10-01 CEST"
>
> dput(bar)
> #> structure(1633046400, class = c("POSIXct", "POSIXt"), tzone = "Europe/Berlin")
> ```
>
> --
> Alexandre Courtiol, www.datazoogang.de
>
> ______________________________________________
> R-package-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>