[R-pkg-devel] Issue handling datetimes: possible differences between computers

Simon Urbanek
Mon Oct 10 03:57:06 CEST 2022


Alexandre,

it's better to parse the timestamp in the correct time zone:

> foo = as.POSIXlt("2021-10-01", "UTC")
> as.POSIXct(as.character(foo), "Europe/Berlin")
[1] "2021-10-01 CEST"

The issue stems from the fact that you are pretending that your timestamp is in UTC (which it is not) while you want to interpret the same values in a different time zone. The DST flag varies with the date (isdst is 0 or 1), and the POSIXlt object does not carry the right value, since you only attached a new time zone without updating the flag:

> str(unclass(as.POSIXlt(foo, "Europe/Berlin")))
List of 9
 $ sec  : num 0
 $ min  : int 0
 $ hour : int 0
 $ mday : int 1
 $ mon  : int 9
 $ year : int 121
 $ wday : int 5
 $ yday : int 273
 $ isdst: int 0
 - attr(*, "tzone")= chr "Europe/Berlin"

Note that isdst is 0, carried over from the UTC entry (UTC has no DST), even though that date falls within DST (CEST) in Europe/Berlin. Compare that to the correctly parsed POSIXlt:

> str(unclass(as.POSIXlt(as.character(foo), "Europe/Berlin")))
List of 11
 $ sec   : num 0
 $ min   : int 0
 $ hour  : int 0
 $ mday  : int 1
 $ mon   : int 9
 $ year  : int 121
 $ wday  : int 5
 $ yday  : int 273
 $ isdst : int 1
 $ zone  : chr "CEST"
 $ gmtoff: int NA
 - attr(*, "tzone")= chr "Europe/Berlin"

where isdst is 1, since DST is indeed in effect on that date. The OS difference seems to be that Linux respects the isdst information from the POSIXlt while Windows and macOS ignore it. This behavior is documented (see ?DateTimeClasses):

     At all other times ‘isdst’ can be deduced from the
     first six values, but the behaviour if it is set incorrectly is
     platform-dependent.

You can reset isdst to -1 (unknown) to make sure R will try to determine it:

> foo$isdst = -1L
> as.POSIXct(foo, "Europe/Berlin")
[1] "2021-10-01 CEST"
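
As a rough sketch, the reset can be wrapped in a small function so that the original object is left untouched. The name lt_to_ct is made up for illustration (it is not part of base R), and foo is rebuilt here from the example in your message:

```r
# Hypothetical helper: convert a POSIXlt to POSIXct in time zone `tz`,
# discarding whatever DST flag the object currently carries.
lt_to_ct <- function(x, tz) {
  stopifnot(inherits(x, "POSIXlt"))
  # A negative isdst means "unknown", so R has to work out DST itself.
  x$isdst <- rep(-1L, length(x))
  as.POSIXct(x, tz = tz)
}

# The POSIXlt from the original report: labelled UTC, isdst = 0.
foo <- structure(list(sec = 0, min = 0L, hour = 0L, mday = 1L, mon = 9L,
                      year = 121L, wday = 5L, yday = 273L, isdst = 0L),
                 class = c("POSIXlt", "POSIXt"), tzone = "UTC")

bar <- lt_to_ct(foo, "Europe/Berlin")
format(bar, "%Y-%m-%d %H:%M:%S %Z")
```

Because the function works on a copy, foo itself keeps isdst = 0.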

So, generally, you cannot simply change the time zone of a POSIXlt. Don't pretend the time is in UTC if it's not: you have to re-parse or re-compute the timestamps for the result to be reliable, or else the DST flag will be wrong.
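
If you prefer re-parsing, a minimal sketch (assuming the wall-clock values stored in the POSIXlt really are local times in the target zone) is to write the fields out as text and parse that text in the zone you want. reparse_in_tz is a hypothetical helper name, not an existing R function:

```r
# Hypothetical helper: reinterpret the wall-clock fields of a POSIXlt in
# time zone `tz` by formatting them as text and parsing that text again.
reparse_in_tz <- function(x, tz) {
  stopifnot(inherits(x, "POSIXlt"))
  fmt <- "%Y-%m-%d %H:%M:%S"
  txt <- format(x, fmt)                  # wall-clock text, no zone attached
  as.POSIXct(txt, tz = tz, format = fmt) # parsing sets the DST flag for tz
}

# The POSIXlt from the original report: labelled UTC, isdst = 0.
foo_lt <- structure(list(sec = 0, min = 0L, hour = 0L, mday = 1L, mon = 9L,
                         year = 121L, wday = 5L, yday = 273L, isdst = 0L),
                    class = c("POSIXlt", "POSIXt"), tzone = "UTC")

bar_ct <- reparse_in_tz(foo_lt, "Europe/Berlin")
as.numeric(bar_ct)  # 1633039200, i.e. 2021-10-01 00:00:00 CEST
```

Since the stale isdst value never reaches the conversion, this should not depend on how the platform treats an incorrect flag.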

Cheers,
Simon


> On 10/10/2022, at 1:14 AM, Alexandre Courtiol <alexandre.courtiol using gmail.com> wrote:
> 
> Hi R pkg developers,
> 
> We are facing a datetime handling issue which manifests itself in a
> package we are working on.
> 
> In context, we noticed that reading datetime info from an excel file
> resulted in different data depending on the computer we used.
> 
> We are aware that timezone and regional settings are general sources
> of troubles, but the code we are using was trying to circumvent this.
> We went only as far as figuring out that the issue happens when
> converting a POSIXlt into a POSIXct.
> 
> Please find below, a minimal reproducible example where `foo` is
> converted to `bar` on two different computers.
> `foo` is a POSIXlt with a defined time zone and upon conversion to a
> POSIXct, despite using a set time zone, we end up with `bar` being
> different on Linux and on a Windows machine.
> 
> We noticed that the difference emerges from the system call
> `.Internal(as.POSIXct())` within `as.POSIXct.POSIXlt()`.
> We also noticed that the internal function in R actually calls
> getenv("TZ") within C, which is probably what explains where the
> difference comes from.
> 
> Such a behaviour is probably expected and not a bug, but what would be
> the strategy to convert a POSIXlt into a POSIXct that would not be
> machine dependent?
> 
> We finally noticed that depending on the datetime used as a starting
> point and on the time zone used when calling `as.POSIXct()`, we
> sometimes have a difference between computers and sometimes not...
> which adds to our puzzlement.
> 
> Many thanks.
> Alex & Liam
> 
> 
> ``` r
> ## On Linux
> foo <- structure(list(sec = 0, min = 0L, hour = 0L, mday = 1L, mon =
> 9L, year = 121L, wday = 5L, yday = 273L, isdst = 0L),
>                 class = c("POSIXlt", "POSIXt"), tzone = "UTC")
> 
> bar <- as.POSIXct(foo, tz = "Europe/Berlin")
> 
> bar
> #> [1] "2021-10-01 01:00:00 CEST"
> 
> dput(bar)
> #> structure(1633042800, class = c("POSIXct", "POSIXt"), tzone = "Europe/Berlin")
> ```
> 
> ``` r
> ## On Windows
> foo <- structure(list(sec = 0, min = 0L, hour = 0L, mday = 1L, mon =
> 9L, year = 121L, wday = 5L, yday = 273L, isdst = 0L),
>                 class = c("POSIXlt", "POSIXt"), tzone = "UTC")
> 
> bar <- as.POSIXct(foo, tz = "Europe/Berlin")
> 
> bar
> #> [1] "2021-10-01 CEST"
> 
> dput(bar)
> #> structure(1633046400, class = c("POSIXct", "POSIXt"), tzone = "Europe/Berlin")
> ```
> 
> -- 
> Alexandre Courtiol, www.datazoogang.de
> 
> ______________________________________________
> R-package-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
> 


