[R] Time zones in POSIXlt objects

Jan van der Laan rhelp sending from eoos.dds.nl
Fri Oct 11 13:12:11 CEST 2024


Thanks,

On 10/11/24 09:10, Ivan Krylov wrote:
> On Thu, 10 Oct 2024 17:16:52 +0200
> Jan van der Laan <rhelp using eoos.dds.nl> wrote:
>
>> This is where it is unclear to me what the purpose is of the `zone`
>> element of the POSIXlt object. It does allow for registering a time
>> zone per element. It just seems to be ignored.
> I think that since POSIXlt is an interface to what the C standard calls
> the "broken-down" time (into parts, not in terms of functionality) and
> both the C standard [1] and the POSIX mktime() [2] ignore the
> tm_gmtoff/tm_zone fields (standard C because it's not defined there;
> POSIX because it defers to standard C), these fields exist for
> presentation purposes. They may be populated when constructing the time
> object, but not used for later calculations.
>
> Instead, the standard mktime() always uses the process-global timezone,
> so when R processes POSIXlt values, it has to set the TZ environment
> variable from the 'tzone' attribute, call tzset() to set global state in
> the library, use mktime() to obtain seconds since epoch, and then reset
> everything back.

So that could then indeed be a performance issue. Still, a warning that 
this field is ignored might be nice.
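
A rough sketch of what such a check could look like. The helper name
check_zone is made up (it is not part of base R), and it assumes that a
'zone' entry is suspicious when it is non-empty and does not occur in the
object's 'tzone' attribute:

```r
# Hypothetical helper (not part of base R): warn when a POSIXlt object carries
# per-element 'zone' abbreviations that do not belong to its 'tzone' attribute.
check_zone <- function(x) {
  stopifnot(inherits(x, "POSIXlt"))
  zone <- unclass(x)$zone
  tzone <- attr(x, "tzone")
  if (is.null(zone)) return(invisible(x))
  # Empty and NA entries mean "unknown"; only flag explicit mismatches
  odd <- !is.na(zone) & nzchar(zone) & !(zone %in% tzone)
  if (any(odd))
    warning(sum(odd), " element(s) have a 'zone' that does not match the ",
            "'tzone' attribute; conversions appear to ignore it")
  invisible(x)
}

lt <- as.POSIXlt(c("2024-01-01 12:30:00", "2024-01-01 12:30:00"), tz = "UTC")
lt$zone <- c("UTC", "EST")

# Capture the warning message instead of letting it propagate
res <- tryCatch(check_zone(lt), warning = function(w) conditionMessage(w))
print(res)
```

Comparing against 'tzone' rather than requiring a single unique zone avoids
false alarms for objects that legitimately mix, say, CET and CEST.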

And there are use-cases where it would be nice to have different time 
zones per record. For example, for a dataset with flights, departure and 
arrival times are often given in the local time zone of the airports and 
these local times might be relevant for some analyses. At the same time, 
to calculate, for example, flight durations these local time zones need 
to be handled correctly.
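
To illustrate with made-up numbers (the flight, its times and the two Olson
zone names are assumptions for the sake of the example): converting each
local time to POSIXct with its own time zone gives the right duration, while
ignoring the zones does not.

```r
# Hypothetical flight: departs Amsterdam 10:00 local, arrives New York 12:15 local
dep <- as.POSIXct("2024-01-15 10:00", tz = "Europe/Amsterdam")
arr <- as.POSIXct("2024-01-15 12:15", tz = "America/New_York")

# POSIXct stores seconds since the epoch, so the zones are handled correctly
duration <- difftime(arr, dep, units = "hours")
print(duration)  # 8.25 hours

# Treating both local times as if they were in one zone gives the wrong answer
naive <- difftime(as.POSIXct("2024-01-15 12:15", tz = "UTC"),
                  as.POSIXct("2024-01-15 10:00", tz = "UTC"),
                  units = "hours")
print(naive)     # 2.25 hours
```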

>> As I mentioned, fortunately, I only have local time and GMT and it
>> would be fine to convert them to a single time zone if that is what
>> it takes to work with them in R.
> Since your data looks like the following:
>
> times <- list(
>     year = c(2024L, 2024L),
>     month = c(1L, 1L),
>     day = c(1L, 1L),
>     hour = c(12L, 12L),
>     minutes = c(30L, 30L),
>     seconds = c(0, 0),
>     timezone = c("", "GMT")
> )
>
> how about converting all the times into POSIXct?
>
> do.call(mapply, c(
>   \(year, month, day, hour, minutes, seconds, timezone)
>    "%04d-%02d-%02d %02d:%02d:%02d" |>
>     sprintf(year, month, day, hour, minutes, seconds) |>
>     as.POSIXct(format = "%Y-%m-%d %H:%M:%S", tz = timezone),
>   times
> )) |> do.call(c, args=_)
>
> Its 'tzone' attribute exists mostly for presentation purposes, so even
> if you lose it, the exact point in UTC-relative time is still intact.

Thanks, I had already started on something similar, starting from 
POSIXlt: a function that converts to POSIXct using the zone information:

lttoct <- function(x) {
   # Time zone of the object as a whole; used for the result
   tzone <- attr(x, "tzone")[1]
   result <- rep(as.POSIXct(0, tz = tzone), length(x))
   # Convert the elements one zone at a time
   zones <- unique(x$zone)
   for (zone in zones) {
     sel <- if (is.na(zone)) is.na(x$zone) else
       x$zone == zone & !is.na(x$zone)
     result[sel] <- as.POSIXct(x[sel], tz = zone)
   }
   result
}

t1 <- as.POSIXlt(c("2023-01-01 12:30", "2024-01-01 12:30"))
t1$zone <- c("", "GMT")

lttoct(t1)
# [1] "2023-01-01 12:30:00 CET" "2024-01-01 13:30:00 CET"
as.POSIXct(t1)
# [1] "2023-01-01 12:30:00 CET" "2024-01-01 12:30:00 CET"


