[Rd] Another issue with Sys.timezone
Martin Maechler
maechler at stat.math.ethz.ch
Fri Oct 20 09:15:42 CEST 2017
>>>>> Stephen Berman <stephen.berman at gmx.net>
>>>>> on Thu, 19 Oct 2017 17:12:50 +0200 writes:
> On Wed, 18 Oct 2017 18:09:41 +0200 Martin Maechler <maechler at stat.math.ethz.ch> wrote:
>>>>>>> Martin Maechler <maechler at stat.math.ethz.ch>
>>>>>>> on Mon, 16 Oct 2017 19:13:31 +0200 writes:
> (I also included a reply to part of this response of yours below.)
>>>>>>> Stephen Berman <stephen.berman at gmx.net>
>>>>>>> on Sun, 15 Oct 2017 01:53:12 +0200 writes:
>>
>>> > (I reported the test failure mentioned below to R-help but was advised
>>> > that this list is the right one to address the issue; in the meantime I
>>> > investigated the matter somewhat more closely, including searching
>>> > recent R-devel postings, since I haven't been following this list.)
>>>
>>> > Last May there were two reports here of problems with Sys.timezone, one
>>> > where the zoneinfo directory is in a nonstandard location
>>> > (https://stat.ethz.ch/pipermail/r-devel/2017-May/074267.html) and the
>>> > other where the system lacks the file /etc/localtime
>>> > (https://stat.ethz.ch/pipermail/r-devel/2017-May/074275.html). My
>>> > system exhibits a third case: it lacks /etc/timezone and does not set TZ
>>> > systemwide, but it does have /etc/localtime, which is a copy of, rather
>>> > than a symlink to, a file under zoneinfo. On this system Sys.timezone()
>>> > returns NA and the Sys.timezone test in reg-tests-1d fails. However, on
>>> > my system I can get the (abbreviated) timezone in R by using as.POSIXlt,
>>> > e.g. as.POSIXlt(Sys.time())$zone. If Sys.timezone took advantage of
>>> > this, e.g. as below, it would be useful on such systems as mine and the
>>> > regression test would pass.
>>>
>>> > my.Sys.timezone <-
>>> >     function (location = TRUE)
>>> > {
>>> >     tz <- Sys.getenv("TZ", names = FALSE)
>>> >     if (!location || nzchar(tz))
>>> >         return(Sys.getenv("TZ", unset = NA_character_))
>>> >     lt <- normalizePath("/etc/localtime")
>>> >     if (grepl(pat <- "^/usr/share/zoneinfo/", lt) ||
>>> >         grepl(pat <- "^/usr/share/zoneinfo.default/", lt))
>>> >         sub(pat, "", lt)
>>> >     else if (lt == "/etc/localtime")
>>> >         if (!file.exists("/etc/timezone"))
>>> >             return(as.POSIXlt(Sys.time())$zone)
>>> >         else if (dir.exists("/usr/share/zoneinfo") && {
>>> >             info <- file.info(normalizePath("/etc/timezone"), extra_cols = FALSE)
>>> >             (!info$isdir && info$size <= 200L)
>>> >         } && {
>>> >             tz1 <- tryCatch(readBin("/etc/timezone", "raw", 200L),
>>> >                             error = function(e) raw(0L))
>>> >             length(tz1) > 0L && all(tz1 %in% as.raw(c(9:10, 13L, 32:126)))
>>> >         } && {
>>> >             tz2 <- gsub("^[[:space:]]+|[[:space:]]+$", "", rawToChar(tz1))
>>> >             tzp <- file.path("/usr/share/zoneinfo", tz2)
>>> >             file.exists(tzp) && !dir.exists(tzp) &&
>>> >                 identical(file.size(normalizePath(tzp)), file.size(lt))
>>> >         })
>>> >             tz2
>>> >         else NA_character_
>>> > }
>>>
>>> > One problem with this is that the zone component of as.POSIXlt only
>>> > holds the abbreviated timezone, not the Olson name.
>>>
>>> Yes, indeed. So, really only for Sys.timezone(location = FALSE) this
>>> should be given, for the default location = TRUE it should
>>> still give NA (i.e. NA_character_) in your setup.
>>>
>>> Interestingly, the Windows versions of Sys.timezone(location =
>>> FALSE) uses something like your proposal, and I tend to think that
>>> -- again only for location=FALSE -- this should be used on
>>> non-Windows as well, at least instead of returning NA then.
>>>
>>> Also for me on 3 different Linuxen (Fedora 24, F. 26, and ubuntu
>>> 14.04 LTS), I get
>>>
>>> > Sys.timezone()
>>> [1] "Europe/Zurich"
>>> > Sys.timezone(FALSE)
>>> [1] NA
>>> >
>>>
>>> whereas on Windows I get Europe/Berlin for the first (why on
>>> earth - I'm really in Zurich) and get "CEST" ("Central European Summer Time")
>>> for the 2nd one instead of NA ... simply using a smarter version
>>> of your proposal. The windows source is
>>> in R's source at src/library/base/R/windows/system.R :
>>>
>>> Sys.timezone <- function(location = TRUE)
>>> {
>>>     tz <- Sys.getenv("TZ", names = FALSE)
>>>     if(nzchar(tz)) return(tz)
>>>     if(location) return(.Internal(tzone_name()))
>>>     z <- as.POSIXlt(Sys.time())
>>>     zz <- attr(z, "tzone")
>>>     if(length(zz) == 3L) zz[2L + z$isdst] else zz[1L]
>>> }
>>>
>>> From what I read, the last three lines also work in your setup
>>> where it seems zz would be of length 1, right ?
> Those lines do indeed work here, but zz has three elements:
>> attributes(as.POSIXlt(Sys.time()))$tzone
> [1] "" "CET" "CEST"
{ "but" ?? yes, three elements is what I see too, but for that
reason there's the if(length(zz) == 3L) ... }
>>> I'd really propose to use these 3 lines in the non-Windows
>>> version of Sys.timezone .. at the end *instead* of NA_character_
>>> (or a slightly safer version which gives NA_character_ if zz is
>>> of length 0, e.g. if there is no "tzone" attribute).
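A minimal sketch of what such a slightly safer variant could look like,
for illustration only. The helper name tz_abbrev is hypothetical; this is
not the code that was committed to the R sources.

```r
## Hypothetical helper (the name tz_abbrev is an assumption, not part of R):
## return the abbreviated name of the current time zone, or NA_character_
## when as.POSIXlt() provides no "tzone" attribute at all.
tz_abbrev <- function(time = Sys.time()) {
    z  <- as.POSIXlt(time)
    zz <- attr(z, "tzone")
    if (length(zz) == 3L) {
        ## zz is c(<name>, <standard abbrev.>, <summer-time abbrev.>);
        ## z$isdst is 0 in standard time and 1 in summer time
        zz[2L + z$isdst]
    } else if (length(zz) >= 1L) {
        zz[1L]
    } else {
        NA_character_          # no "tzone" attribute
    }
}

tz_abbrev()
```

The only difference from the three Windows lines is the final branch,
which returns NA_character_ instead of a zero-length result.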
>>>
>>> > I don't know how to
>>> > get the Olson name using only R functions, but maybe it would be good
>>> > enough to return the abbreviated timezone where possible, e.g. as above.
>>> > (On my system I can get the Olson name of the timezone in R with a shell
>>> > pipeline, e.g.: system("find /usr/share/zoneinfo/ -type f | xargs md5sum
>>> > | grep $(md5sum /etc/localtime | cut -d ' ' -f 1) | head -n 1 | cut -d
>>> > '/' -f 5,6"), but the last part of this is tailored to my configuration
>>> > and the whole thing is not OS-neutral, so it isn't suitable for
>>> > Sys.timezone.)
>>>
>>> > Steve Berman
>>>
>>> Definitely not. I still recommend you think of a more portable
>>> solution for the `location = TRUE` (default) case in Sys.timezone().
>>> Returning the non-location form (e.g "CEST") when something like
>>> "Europe/Zurich" is expected is really not a good idea,
>>> and you are lucky that the regression test passes "accidentally" ...
>>>
>>> Martin
>>
>> In the meantime, I have committed a common version (Windows and
>> non-Windows) of Sys.timezone() to the R development sources
>> (aka "R-devel").
>>
>> That now uses as.POSIXlt(Sys.time()) very similarly to the
>> above "Windows only" case, but __only__ for 'location=FALSE'
>> which is not the default.
> Thanks, I think that's definitely better than returning NA when
> `location' is false...
>> The most current development source is always available (via
>> 'svn' or alternatively for browsing via your web browser) from
>>
>> https://svn.r-project.org/R/trunk/src/library/base/R/datetime.R
> ...however, I tried the test that failed for me during `make check' now
> with this new definition of Sys.timezone() by pasting the definition (as
> new.Sys.timezone()) and the two lines of the test code into the R console,
> and this is what happened:
>> new.Sys.timezone()
>> new.Sys.timezone(FALSE)
> [1] "CEST"
>> (S.t <- new.Sys.timezone())
> NULL
>> if(is.na(S.t) || !nzchar(S.t)) stop("could not get timezone")
> Error in if (is.na(S.t) || !nzchar(S.t)) stop("could not get timezone") :
> missing value where TRUE/FALSE needed
> In addition: Warning message:
> In is.na(S.t) : is.na() applied to non-(list or vector) of type 'NULL'
> This is because `location' is true but all the if-clauses in the body
> following `if(location)' are false, so it returns NULL. If you add the
> line `else NA_character_' below the line `tz2', then NA is returned and
> the test fails as before instead of as above.
Thank you for the perfect diagnosis. Embarrassingly, I had
dropped this else-clause accidentally.
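The effect can be reproduced with a toy function, sketched here with
made-up names (sign_without_else and sign_with_else are hypothetical and
unrelated to the real Sys.timezone() body): an if/else-if chain without a
final else yields NULL when no condition holds.

```r
## Toy illustration: names and logic are assumptions for demonstration only.
sign_without_else <- function(x) {
    if (x > 0) "positive"
    else if (x < 0) "negative"
    ## no final 'else': the value is (invisible) NULL when x == 0
}

sign_with_else <- function(x) {
    if (x > 0) "positive"
    else if (x < 0) "negative"
    else NA_character_         # explicit fallback, as in the restored code
}

is.null(sign_without_else(0))  # TRUE
is.na(sign_with_else(0))       # TRUE
```

With the explicit fallback, a test such as is.na(S.t) sees a length-one
character NA rather than NULL.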
>> As you say yourself, the above workaround using
>> system("... xargs md5sum ...") is really too platform-specific, but I'd
>> guess there should be a less error-prone way to get the long timezone
>> name on your system ...
> If I understand the zic(8) man page, the files in /usr/share/zoneinfo
> should contain this information, but I don't know how to extract it,
> since these are compiled files. And since on my system /etc/localtime
> is a copy of one of these compiled files, I don't know of any other way
> to recover the location name without comparing it to those files.
>> If that remains "contained" (i.e. small) and works with files
>> and R's files tools -- e.g. file.*() ones [but not system()],
>> I'd consider a patch to the above source file
>> (sent by you to the R-devel mailing list --- or after having
>> gotten an account there by asking, via bug report & patch
>> attachment at https://bugs.r-project.org/ )
> If comparing file size sufficed, that would be easy to do in R;
> unfortunately, it is not sufficient, since some files designating
> different time zones in /usr/share/zoneinfo do have the same size. So
> the only alternative I can think of is to compare bytes, e.g. with
> md5sum or with cmp. Is there some way to do this in R without using
> system()?
Can't you use
tz1 <- readBin("/etc/localtime", "raw", 200L)
plus later
tz2 <- gsub(......., rawToChar(tz1))
on your /etc/localtime file
almost identically as the current code does for "/etc/timezone" ?
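As a rough sketch of how the bytes could be compared with R's file tools
only (no system()): the helper name find_zone_by_content and its two
arguments are hypothetical, and the default paths are the ones mentioned
in this thread, which need not hold on every system.

```r
## Hypothetical helper (not part of R): return the names of the files
## below 'zoneinfo' whose contents are byte-identical to 'localtime',
## or NA_character_ if there is none.
find_zone_by_content <- function(localtime = "/etc/localtime",
                                 zoneinfo  = "/usr/share/zoneinfo") {
    if (!file.exists(localtime) || !dir.exists(zoneinfo))
        return(NA_character_)
    size   <- file.size(localtime)
    target <- readBin(localtime, "raw", size)
    cand   <- list.files(zoneinfo, recursive = TRUE)
    ## cheap filter first: only files of the same size can be identical
    cand <- cand[which(file.size(file.path(zoneinfo, cand)) == size)]
    same <- vapply(cand, function(f)
        identical(readBin(file.path(zoneinfo, f), "raw", size), target),
        logical(1))
    if (any(same)) cand[same] else NA_character_
}
```

Note that more than one name may come back, since several files under
zoneinfo can have identical contents, so a caller would still have to
pick one. tools::md5sum() could serve for the comparison as well.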
Martin