[R-pkg-devel] Issue handling datetimes: possible differences between computers
Martin Maechler
maechler at stat.math.ethz.ch
Tue Oct 11 08:58:01 CEST 2022
>>>>> Ben Bolker
>>>>> on Mon, 10 Oct 2022 16:59:35 -0400 writes:
> Right now as.POSIXlt.Date() is just
> function (x, ...)
> .Internal(Date2POSIXlt(x))
It has been quite a bit different in R-devel for a little
while. NEWS entries (there are more already, and more coming
on the wider topic):
* The as.POSIXlt(<POSIXlt>) and as.POSIXct(<POSIXct>) default
methods now do obey their tz argument, also in this case.
* as.POSIXlt(<Date>) now does apply a tz (timezone) argument, as
does as.POSIXct(); partly suggested by Roland Fuss on the R-devel
mailing list.
and indeed it would have been good had you used (and read) the
R-devel mailing list which is much more appropriate on the
topic of *changing* base R behavior.
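For code that has to run on both released R and R-devel, one version-independent route is the one Simon suggests further down in the quoted thread: format the Date as a string and parse that string in the time zone of interest. A minimal sketch, assuming a helper name (`date_to_midnight`, hypothetical, not part of base R) and that the tz database on the machine knows "Europe/Berlin":

```r
# Hypothetical helper, not part of base R: local midnight of a Date in `tz`.
date_to_midnight <- function(d, tz) {
  stopifnot(inherits(d, "Date"), is.character(tz), length(tz) == 1L)
  # format() yields "YYYY-MM-DD"; parsing that string in `tz` gives
  # midnight local time rather than midnight UTC.
  as.POSIXlt(format(d), tz = tz, format = "%Y-%m-%d")
}

foo <- as.Date(33874, origin = "1899-12-30")
bar <- date_to_midnight(foo, "Europe/Berlin")
print(bar)
print(attr(bar, "tzone"))
```

Because the time zone is applied while parsing, this does not depend on whether as.POSIXlt(<Date>) itself honours tz.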
> How expensive would it be to throw a warning when '...' is provided by
> the user/discarded ??
> Alternately, perhaps the documentation could be amended, although I'm
> not quite sure what to suggest. (The sentence Liam refers to, "Dates
> without times are treated as being at midnight UTC." is correct but
> terse ...)
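On the warning idea: until a version of R with the change above is in use, a package can approximate such a check itself. The following is only a sketch of a user-level wrapper (the name `as_POSIXlt_checked` is hypothetical and not part of base R). It compares the `tzone` attribute of the result with the requested `tz` and warns if they differ, so it should stay silent once as.POSIXlt(<Date>) honours tz:

```r
# Hypothetical wrapper, not part of base R: warn when a requested tz
# was not applied to a Date input.
as_POSIXlt_checked <- function(x, tz = NULL, ...) {
  # No tz requested: defer entirely to as.POSIXlt()
  if (is.null(tz)) return(as.POSIXlt(x, ...))
  res <- as.POSIXlt(x, tz = tz, ...)
  applied <- attr(res, "tzone")[1L]
  if (inherits(x, "Date") && nzchar(tz) && !identical(applied, tz)) {
    warning("as.POSIXlt() did not apply tz = \"", tz,
            "\" to the Date input; result is in \"", applied, "\"",
            call. = FALSE)
  }
  res
}

foo <- as.Date(33874, origin = "1899-12-30")
# Warns on versions of R where the Date method ignores tz
bar <- as_POSIXlt_checked(foo, tz = "Europe/Berlin")
```

This only detects the discarded argument; it does not change what as.POSIXlt() returns.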
> On 2022-10-10 4:50 p.m., Alexandre Courtiol wrote:
>> Hi Simon,
>>
>> Thanks for the clarification.
>>
>> From a naive developer point of view, we were initially baffled that the
>> generic as.POSIXlt() does very different things on a character and on a
>> Date input:
>>
>> as.POSIXlt(as.character(foo), "Europe/Berlin")
>> [1] "1992-09-27 CEST"
>>
>> as.POSIXlt(foo, "Europe/Berlin")
>> [1] "1992-09-27 UTC"
>>
>> Based on what you said, it does make sense: it is only when creating the
>> date/time that we want to include the time zone and that only happens when
>> we don't already work on a previously created date.
>> That is your subtle but spot-on distinction between "parsing" and
>> "changing" the time zone.
>>
>> Yet, we do find it dangerous that as.POSIXlt.Date() accepts a time zone but
>> does nothing with it, especially when the help file starts with:
>>
>> Usage
>> as.POSIXlt(x, tz = "", ...)
>>
>> The behaviour is documented, as Liam reported it, but still, we will almost
>> certainly not be the last one tripping on this (without even adding the
>> additional issue of as.POSIXct() behaving differently across OS).
>>
>> Thanks again,
>>
>> Alex & Liam
>>
>>
>>
>>
>> On Mon, 10 Oct 2022 at 22:13, Simon Urbanek <simon.urbanek using r-project.org>
>> wrote:
>>
>>> Liam,
>>>
>>> I think I have failed to convey my main point in the last e-mail - which
>>> was that you want to parse the date/time in the timezone that you care
>>> about so in your example that would be
>>>
>>>> foo <- as.Date(33874, origin = "1899-12-30")
>>>> foo
>>> [1] "1992-09-27"
>>>> as.POSIXlt(as.character(foo), "Europe/Berlin")
>>> [1] "1992-09-27 CEST"
>>>
>>> I was explicitly saying that you do NOT want to simply change the time
>>> zone on POSIXlt objects as that won't work for reasons I explained - see my
>>> last e-mail.
>>>
>>> Cheers,
>>> Simon
>>>
>>>
>>>> On 11/10/2022, at 6:31 AM, Liam Bailey <liam.bailey using liamdbailey.com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> Thanks Simon for the detailed response, that helps us understand a lot
>>>> better what’s going on! However, with your response in mind, we still
>>>> encounter some behaviour that we did not expect.
>>>>
>>>> I’ve included another minimum reproducible example below to expand on
>>>> the situation. In this example, `foo` is a Date object that we generate
>>>> from a numeric input. Following your advice, `bar` is then a POSIXlt object
>>>> where we now explicitly define timezone using argument tz. However, even
>>>> though we are explicit about the timezone the POSIXlt that is generated is
>>>> always in UTC. This then leads to the issues outlined by Alexandre above,
>>>> which we now understand are caused by DST.
>>>>
>>>> ``` r
>>>> #Generate date from numeric
>>>> #Not possible to specify tz at this point
>>>> foo <- as.Date(33874, origin = "1899-12-30")
>>>> dput(foo)
>>>> #> structure(8305, class = "Date")
>>>>
>>>> #Convert to POSIXlt specifying UTC timezone
>>>> bar <- as.POSIXlt(foo, tz = "UTC")
>>>> dput(bar)
>>>> #> structure(list(sec = 0, min = 0L, hour = 0L, mday = 27L, mon = 8L,
>>>> #> year = 92L, wday = 0L, yday = 270L, isdst = 0L), class = c("POSIXlt",
>>>> #> "POSIXt"), tzone = "UTC")
>>>>
>>>> #Convert to POSIXlt specifying Europe/Berlin.
>>>> #Time zone is still UTC
>>>> bar <- as.POSIXlt(foo, tz = "Europe/Berlin")
>>>> dput(bar)
>>>> #> structure(list(sec = 0, min = 0L, hour = 0L, mday = 27L, mon = 8L,
>>>> #> year = 92L, wday = 0L, yday = 270L, isdst = 0L), class = c("POSIXlt",
>>>> #> "POSIXt"), tzone = "UTC")
>>>> ```
>>>>
>>>>
>>>> We noticed that this occurs because the tz argument is not passed to
>>>> `.Internal(Date2POSIXlt())` inside `as.POSIXlt.Date()`.
>>>>
>>>> Reading through the documentation for `as.POSIX*` we can see that this
>>>> behaviour is described:
>>>>
>>>> > “Dates without times are treated as being at midnight UTC.”
>>>>
>>>> In this case, if we want to convert a Date object to POSIX* and specify
>>>> a (non-UTC) timezone would the best strategy be to first coerce our Date
>>>> object to character? Alternatively, `lubridate::as_datetime()` does seem to
>>>> recognise the tz argument and convert a Date object to POSIX* with non-UTC
>>>> time zone (see second example below). But it would be nice to know if there
>>>> are subtle differences between these two approaches that we should be aware
>>>> of.
>>>>
>>>> ``` r
>>>> foo <- as.Date(33874, origin = "1899-12-30")
>>>> dput(foo)
>>>> #> structure(8305, class = "Date")
>>>>
>>>> #Convert to POSIXct specifying UTC timezone
>>>> bar <- lubridate::as_datetime(foo, tz = "UTC")
>>>> dput(as.POSIXlt(bar))
>>>> #> structure(list(sec = 0, min = 0L, hour = 0L, mday = 27L, mon = 8L,
>>>> #> year = 92L, wday = 0L, yday = 270L, isdst = 0L), class = c("POSIXlt",
>>>> #> "POSIXt"), tzone = "UTC")
>>>>
>>>> #Convert to POSIXct specifying Europe/Berlin
>>>> bar <- lubridate::as_datetime(foo, tz = "Europe/Berlin")
>>>> dput(as.POSIXlt(bar))
>>>> #> structure(list(sec = 0, min = 0L, hour = 0L, mday = 27L, mon = 8L,
>>>> #> year = 92L, wday = 0L, yday = 270L, isdst = 1L, zone = "CEST",
>>>> #> gmtoff = 7200L), class = c("POSIXlt", "POSIXt"), tzone = c("Europe/Berlin",
>>>> #> "CET", "CEST"))
>>>> ```
>>>>
>>>> Thanks again for all your help.
>>>> Alex & Liam
>>>>
>>>>> On 10 Oct 2022, at 6:40 pm, Hadley Wickham <h.wickham using gmail.com> wrote:
>>>>>
>>>>> On Sun, Oct 9, 2022 at 9:31 PM Jeff Newmiller <jdnewmil using dcn.davis.ca.us> wrote:
>>>>>>
>>>>> ... which is why tidyverse functions and Python datetime handling irk
>>>>> me so much.
>>>>>>
>>>>> Is tidyverse time handling intrinsically broken? They have a standard
>>>>> practice of reading time as UTC and then using force_tz to fix the
>>>>> "mistake". Same as Python.
>>>>>
>>>>> Can you point to any docs that lead you to this conclusion so we can
>>>>> get them fixed? I strongly encourage people to parse date-times in the
>>>>> correct time zone; this is why lubridate::ymd_hms() and friends have a
>>>>> tz argument.
>>>>>
>>>>> Hadley
>>>>>
>>>>> --
>>>>> http://hadley.nz
>>>>
>>>
>>>
>>
> --
> Dr. Benjamin Bolker
> Professor, Mathematics & Statistics and Biology, McMaster University
> Director, School of Computational Science and Engineering
> (Acting) Graduate chair, Mathematics & Statistics
>> E-mail is sent at my convenience; I don't expect replies outside of
> working hours.
> ______________________________________________
> R-package-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel