[Rd] range() for Date and POSIXct could respect `finite = TRUE`

Martin Maechler m@ech|er @end|ng |rom @t@t@m@th@ethz@ch
Fri May 19 17:30:08 CEST 2023


>>>>> Bill Dunlap 
>>>>>     on Thu, 11 May 2023 10:42:48 -0700 writes:

    >> What do others think?

    > I can imagine a class, "TemperatureKelvins", that wraps a
    > double but would have a range of 0 to Inf or one called
    > "GymnasticsScore" with a range of 0 to 10.  For those
    > sorts of things it would be nice to have a generic that
    > gave the possible min and max for the class instead of one
    > that just said they were -Inf and Inf or not.

    > -Bill

yeah.. I agree that a general concept of such an interval class
is even more flexible and generally useful.
OTOH, people have already introduced such classes where they
were really needed, and here it's really about
*if*
is.finite() and is.infinite() are also available and working
but not always FALSE (which they are for logical, integer,
character *and* raw, the latter really debatable - but *not* in this thread).

So, allows.infinite(x)  would *not* vectorize but return TRUE or
FALSE (and typically not NA ..), in some sense being a property
of class(x) only.


    > On Thu, May 11, 2023 at 1:49 AM Martin Maechler
    > <maechler using stat.math.ethz.ch> wrote:

    >> >>>>> Davis Vaughan
    >> >>>>>     on Tue, 9 May 2023 09:49:41 -0400 writes:
    >> 
    >> > It seems like the main problem is that `is.numeric(x)`
    >> > isn't fully indicative of whether or not `is.finite(x)`
    >> > makes sense for `x` (i.e.  Date isn't numeric but does
    >> > allow infinite dates).
    >> 
    >> > So I could also imagine a new `allows.infinite()` S3
    >> > generic that would return a single TRUE/FALSE for whether
    >> > or not the type allows infinite values, this would also be
    >> > indicative of whether or not `is.finite()` and
    >> > `is.infinite()` make sense on that type. I imagine it
    >> > being used like:

    >> > ```
    >> >   allows.infinite <- function(x) {
    >> >     UseMethod("allows.infinite")
    >> >   }
    >> >   allows.infinite.default <- function(x) {
    >> >     is.numeric(x) # For backwards compatibility, maybe? Not sure.
    >> >   }

it would have to include  is.complex() as well *and*
in principle I'd want to *exclude* integers as they really
cannot be +/- Inf
... but then you did say "not sure" ..

I'm still somewhat favoring this proposal,
because it would be a bit more generally applicable
but still very simple.

Personally, I'd go for the shorter allowsInf()  name,
not adding another  <word1>.<word2>()  generic function,
but that's less important and should not determine decisions I think.

Martin

    >> >   allows.infinite.Date <- function(x) {
    >> >     TRUE
    >> >   }
    >> >   allows.infinite.POSIXct <- function(x) {
    >> >     TRUE
    >> >   }
    >> >
    >> >   range.default <- function (..., na.rm = FALSE, finite = FALSE) {
    >> >     x <- c(..., recursive = TRUE)
    >> >     if (allows.infinite(x)) { # changed from `is.numeric()`
    >> >       if (finite)
    >> >         x <- x[is.finite(x)]
    >> >       else if (na.rm)
    >> >         x <- x[!is.na(x)]
    >> >       c(min(x), max(x))
    >> >     }
    >> >     else {
    >> >       if (finite)
    >> >         na.rm <- TRUE
    >> >       c(min(x, na.rm = na.rm), max(x, na.rm = na.rm))
    >> >     }
    >> >   }
    >> >   ```
    >> 
    >> > It could allow other R developers to also use the pattern of:
    >> 
    >> > ```
    >> > if (allows.infinite(x)) {
    >> >    # conditionally do stuff with is.infinite(x)
    >> > }
    >> > ```
    >> 
    >> > and that seems like it could be rather nice.
    >> 
    >> > It would avoid the need for `range.Date()` and `range.POSIXct()`
    >> methods too.
    >> 
    >> > -Davis
    >> 
    >> That *is* an interesting alternative perspective ...
    >> sent just about before I was going to commit my proposal (incl
    >> new help page entries, regr.tests ..).
    >> 
    >> So we would introduce a new generic  allows.infinite() {or
    >> better name?,  allowsInf, ..} with the defined semantic that
    >> 
    >> allows.infinite(x) for a vector 'x' gives a logical "scalar",
    >> TRUE iff it is known that  is.finite(x) "makes sense" and
    >> returns a logical vector of length length(x) .. which is TRUE
    >> where x[i] is not NA/NaN/+Inf/-Inf .. *and*
    >> is.infinite := Negate(is.finite)    {or vice versa if you prefer}.
    >> 
    >> I agree that this may be useful somewhat more generally than
    >> just for  range() methods.
    >> 
    >> What do others think?
    >> 
    >> Martin
    >> 
    >> 
    >> > On Thu, May 4, 2023 at 5:29 AM Martin Maechler
    >> > <maechler using stat.math.ethz.ch> wrote:
    >> [......]
    >> 
    >> >> >>>>> Davis Vaughan
    >> >> >>>>>     on Mon, 1 May 2023 08:46:33 -0400 writes:
    >> >>
    >> >> > Martin,
    >> >> > Yes, I missed that those have `Summary.*` methods, thanks!
    >> >>
    >> >> > Tweaking those to respect `finite = TRUE` sounds great. It seems
    >> like
    >> >> > it might be a little tricky since the Summary methods call
    >> >> > `NextMethod()`, and `range.default()` uses `is.numeric()` to
    >> determine
    >> >> > whether or not to apply `finite`. Because `is.numeric.Date()` is
    >> >> > defined, that always returns `FALSE` for Dates (and POSIXt).
    >> Because
    >> >> > of that, it may still be easier to just write a specific
    >> >> > `range.Date()` method, but I'm not sure.
    >> >>
    >> >> > -Davis
    >> >>
    >> >> I've looked more closely now, and indeed,
    >> >> range() is the only function in the  Summary  group
    >> >> where (only) the default method has a 'finite' argument.
    >> >> which strikes me as somewhat asymmetric / inconsequential, as
    >> >> after all,  range(.) := c(min(.), max(.)) ,
    >> >> but  min() and max() do not obey an finite=TRUE setting, note
    >> >>
    >> >> > min(c(-Inf,3:5), finite=TRUE)
    >> >> Error: attempt to use zero-length variable name
    >> >>
    >> >> where the error message also is not particularly friendly
    >> >> and of course has nothing to with 'finite' :
    >> >>
    >> >> > max(1:4, foo="bar")
    >> >> Error: attempt to use zero-length variable name
    >> >> >
    >> >>
    >> >> ... but that is diverting;  coming back to the topic:  Given
    >> >> that 'finite' only applies to range() {and there is just a
    >> convenience},
    >> >> I do agree that from my own work & support to make `Date` and
    >> >> `POSIX(c)t` behave more number-like, it would be "nice" to have
    >> >> range() obey a `finite=TRUE` also for these.
    >> >>
    >> >> OTOH, there are quite a few other 'number-like' thingies for
    >> >> which I would then like to have  range(*, finite=TRUE) work,
    >> >> e.g.,  "mpfr" (package {Rmpfr}) or "bigz" {gmp} numbers, numeric
    >> >> sparse matrices, ...
    >> >>
    >> >> To keep such methods all internally consistent with
    >> >> range.default(), I could envision something like this
    >> >>
    >> >>
    >> >> .rangeNum <- function(..., na.rm = FALSE, finite = FALSE, isNumeric)
    >> >> {
    >> >> x <- c(..., recursive = TRUE)
    >> >> if(isNumeric(x)) {
    >> >> if(finite) x <- x[is.finite(x)]
    >> >> else if(na.rm) x <- x[!is.na(x)]
    >> >> c(min(x), max(x))
    >> >> } else {
    >> >> if(finite) na.rm <- TRUE
    >> >> c(min(x, na.rm=na.rm), max(x, na.rm=na.rm))
    >> >> }
    >> >> }
    >> >>
    >> >> range.default <- function(..., na.rm = FALSE, finite = FALSE)
    >> >> .rangeNum(..., na.rm=na.rm, finite=finite, isNumeric = is.numeric)
    >> >>
    >> >> range.POSIXct <- range.Date <- function(..., na.rm = FALSE, finite
    >> = FALSE)
    >> >> .rangeNum(..., na.rm=na.rm, finite=finite, isNumeric =
    >> function(.)TRUE)
    >> >>
    >> >>
    >> >>
    >> >> which would also provide .rangeNum() to be used by implementors
    >> >> of other numeric-like classes to provide their own range()
    >> >> method as a 1-liner *and* be future-consistent with the default
    >> method..
    >> >>
    >> >>
    >> >>
    >> >>
    >> >> > On Sat, Apr 29, 2023 at 4:47 PM Martin Maechler
    >> >> > <maechler using stat.math.ethz.ch> wrote:
    >> >> >>
    >> >> >> >>>>> Davis Vaughan via R-devel
    >> >> >> >>>>>     on Fri, 28 Apr 2023 11:12:27 -0400 writes:
    >> >> >>
    >> >> >> > Hi all,
    >> >> >>
    >> >> >> > I noticed that `range.default()` has a nice `finite =
    >> >> >> > TRUE` argument, but it doesn't actually apply to Date or
    >> >> >> > POSIXct due to how `is.numeric()` works.
    >> >> >>
    >> >> >> Well, I think it would / should never apply:
    >> >> >>
    >> >> >> range() belongs to the "Summary" group generics (as min, max,
    >> ...)
    >> >> >>
    >> >> >> and there  *are*  Summary.Date()  and Summary.POSIX{c,l}t()
    >> methods.
    >> >> >>
    >> >> >> Without checking further for now, I think you are indirectly
    >> >> >> suggesting to enhance these three Summary.*() methods so they do
    >> >> >> obey  'finite = TRUE' .
    >> >> >>
    >> >> >> I think I agree they should.
    >> >> >>
    >> >> >> Martin
    >> >> >>
    >> >> >> > ``` x <- .Date(c(0, Inf, 1, 2, Inf)) x #> [1] "1970-01-01"
    >> >> >> > "Inf" "1970-01-02" "1970-01-03" "Inf"
    >> >> >>
    >> >> >> > # Darn!  range(x, finite = TRUE) #> [1] "1970-01-01" "Inf"
    >> >> >>
    >> >> >> > # What I want .Date(range(unclass(x), finite = TRUE)) #>
    >> >> >> > [1] "1970-01-01" "1970-01-03" ```
    >> >> >>
    >> >> >> > I think `finite = TRUE` would be pretty nice for Dates in
    >> >> >> > particular.
    >> >> >>
    >> >> >> > As a motivating example, sometimes you have ranges of
    >> >> >> > dates represented by start/end pairs. It is fairly natural
    >> >> >> > to represent an event that hasn't ended yet with an
    >> >> >> > infinite date. If you need to then compute a sequence of
    >> >> >> > dates spanning the full range of the start/end pairs, it
    >> >> >> > would be nice to be able to use `range(finite = TRUE)` to
    >> >> >> > do so:
    >> >> >>
    >> >> >> > ``` start <- as.Date(c("2019-01-05", "2019-01-10",
    >> >> >> > "2019-01-11", "2019-01-14")) end <-
    >> >> >> > as.Date(c("2019-01-07", NA, "2019-01-14", NA))
    >> >> >> > end[is.na(end)] <- Inf
    >> >> >>
    >> >> >> > # `end = Inf` means that the event hasn't "ended" yet
    >> >> >> > data.frame(start, end) #> start end #> 1 2019-01-05
    >> >> >> > 2019-01-07 #> 2 2019-01-10 Inf #> 3 2019-01-11 2019-01-14
    >> >> >> > #> 4 2019-01-14 Inf
    >> >> >>
    >> >> >> > # Create a full sequence along all days in start/end range
    >> >> >> > <- .Date(range(unclass(c(start, end)), finite = TRUE))
    >> >> >> > seq(range[1], range[2], by = 1) #> [1] "2019-01-05"
    >> >> >> > "2019-01-06" "2019-01-07" "2019-01-08" "2019-01-09" #> [6]
    >> >> >> > "2019-01-10" "2019-01-11" "2019-01-12" "2019-01-13"
    >> >> >> > "2019-01-14" ```
    >> >> >>
    >> >> >> > It seems like one option is to create a `range.Date()`
    >> >> >> > method that unclasses, forwards the arguments on to a
    >> >> >> > second call to `range()`, and then reclasses?
    >> >> >>
    >> >> >> > ``` range.Date <- function(x, ..., na.rm = FALSE, finite =
    >> >> >> > FALSE) { .Date(range(unclass(x), na.rm = na.rm, finite =
    >> >> >> > finite), oldClass(x)) } ```
    >> >> >>
    >> >> >> > This is similar to how `rep.Date()` works.
    >> >> >>
    >> >> >> > Thanks, Davis Vaughan
    >> >> >>
    >> >> >> > ______________________________________________
    >> >> >> > R-devel using r-project.org mailing list
    >> >> >> > https://stat.ethz.ch/mailman/listinfo/r-devel
    >> 
    >> ______________________________________________
    >> R-devel using r-project.org mailing list
    >> https://stat.ethz.ch/mailman/listinfo/r-devel
    >> 

    > [[alternative HTML version deleted]]



More information about the R-devel mailing list