[Rd] range() for Date and POSIXct could respect `finite = TRUE`
Gabriel Becker
g@bembecker @end|ng |rom gm@||@com
Fri May 19 18:23:50 CEST 2023
Hi All,
I think there may be some confusion about what allowsInf would be
reporting (or maybe it's just me :) ) if we did this.
Consider a class "myclass" (S3, for starters) with
allowsInf.myclass <- function(x) FALSE
Then, what would
myclassthing <- structure(1.5, class = "myclass")
myclassthing[1] <- Inf
do? Presumably it would happily complete without complaint, right, even
though allowsInf(myclassthing) would return FALSE? Thus an infinite value
was allowed. This seems very misleading/counter-intuitive to me.
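To make the concern concrete, here is a minimal sketch. It assumes a hypothetical S3 generic allowsInf(), which does not exist in base R, and a toy class "myclass"; both are made up purely for illustration.

```r
# Hypothetical generic: allowsInf() is NOT part of base R.
allowsInf <- function(x) UseMethod("allowsInf")

# The class author declares that "myclass" objects do not allow Inf.
allowsInf.myclass <- function(x) FALSE

myclassthing <- structure(1.5, class = "myclass")
allowsInf(myclassthing)              # FALSE

# `[<-` knows nothing about allowsInf(), so this succeeds silently.
myclassthing[1] <- Inf

# The object now holds an infinite value while allowsInf() still says FALSE.
is.infinite(unclass(myclassthing))   # TRUE
allowsInf(myclassthing)              # FALSE
```

Nothing in S3 ties the assignment to the declared property, so the two can disagree unless the class author also writes a `[<-` method that checks it.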
Perhaps this is just an issue with the proposed naming, though I'm not
certain that's the case.
I guess what I'm saying is that allowsInf at its core is a validation criterion
for objects of a particular class, and under that paradigm, having that
validation not be enforced (which it would not be, at least for S3 classed
objects, I imagine) seems like it would muddy the waters further rather
than making things clearer.
Put another way, and as pointed out by Bill above, the result of allowsInf
is really an attribute of a *class*, not of an object. allowsInf(x) is
really just a proxy for allowsInf(class(x)), right? The problem here is
that S3 doesn't *have* classes in a sense that makes the latter coherent.
It's notable here that developers could also get around this by implementing
methods for the Summary group generic that either implement the finite
argument or not, as appropriate for their class, right? And that would be
true whether or not the defaults for, e.g., min and max were altered to have
the finite argument to match range.
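As a rough sketch of that route: the class "kelvin" and its Summary method below are hypothetical, and this is only one way such a method might be written, not a proposal for base R.

```r
# Hypothetical class "kelvin" wrapping a double; all names are illustrative.
Summary.kelvin <- function(..., na.rm = FALSE, finite = FALSE) {
  # Collect all arguments as one plain double vector.
  x <- unlist(lapply(list(...), unclass))
  if (finite) {
    x <- x[is.finite(x)]        # drops NA, NaN, Inf and -Inf
  } else if (na.rm) {
    x <- x[!is.na(x)]
  }
  # Call the base function named by .Generic on the plain vector.
  out <- do.call(.Generic, list(x))
  # Re-class only results that are still temperatures.
  if (.Generic %in% c("min", "max", "range")) {
    structure(out, class = "kelvin")
  } else {
    out
  }
}

temps <- structure(c(273.15, Inf, 300), class = "kelvin")
unclass(range(temps, finite = TRUE))   # 273.15 and 300
unclass(max(temps, finite = TRUE))     # 300
unclass(max(temps))                    # Inf
```

Because the method itself declares `finite`, min() and max() honor it for this class even though their defaults do not.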
Best,
~G
On Fri, May 19, 2023 at 8:30 AM Martin Maechler <maechler using stat.math.ethz.ch>
wrote:
> >>>>> Bill Dunlap
> >>>>> on Thu, 11 May 2023 10:42:48 -0700 writes:
>
> >> What do others think?
>
> > I can imagine a class, "TemperatureKelvins", that wraps a
> > double but would have a range of 0 to Inf or one called
> > "GymnasticsScore" with a range of 0 to 10. For those
> > sorts of things it would be nice to have a generic that
> > gave the possible min and max for the class instead of one
> > that just said they were -Inf and Inf or not.
>
> > -Bill
>
> yeah.. I agree that a general concept of such an interval class
> is even more flexible and generally useful.
> OTOH, people have already introduced such classes where they
> were really needed, and here it's really about
> *if*
> is.finite() and is.infinite() are also available and working
> but not always FALSE (which they are for logical, integer,
> character *and* raw, the latter really debatable - but *not* in this
> thread).
>
> So, allows.infinite(x) would *not* vectorize but return TRUE or
> FALSE (and typically not NA ..), in some sense being a property
> of class(x) only.
>
>
> > On Thu, May 11, 2023 at 1:49 AM Martin Maechler
> > <maechler using stat.math.ethz.ch> wrote:
>
> >> >>>>> Davis Vaughan
> >> >>>>> on Tue, 9 May 2023 09:49:41 -0400 writes:
> >>
> >> > It seems like the main problem is that `is.numeric(x)`
> >> > isn't fully indicative of whether or not `is.finite(x)`
> >> > makes sense for `x` (i.e. Date isn't numeric but does
> >> > allow infinite dates).
> >>
> >> > So I could also imagine a new `allows.infinite()` S3
> >> > generic that would return a single TRUE/FALSE for whether
> >> > or not the type allows infinite values, this would also be
> >> > indicative of whether or not `is.finite()` and
> >> > `is.infinite()` make sense on that type. I imagine it
> >> > being used like:
>
> >> > ```
> >> > allows.infinite <- function(x) {
> >> >   UseMethod("allows.infinite")
> >> > }
> >> > allows.infinite.default <- function(x) {
> >> >   is.numeric(x) # For backwards compatibility, maybe? Not sure.
> >> > }
>
> it would have to include is.complex() as well *and*
> in principle I'd want to *exclude* integers as they really
> cannot be +/- Inf
> ... but then you did say "not sure" ..
>
> I'm still somewhat favoring this proposal,
> because it would be a bit more generally applicable
> but still very simple.
>
> Personally, I'd go for the shorter allowsInf() name,
> not adding another <word1>.<word2>() generic function,
> but that's less important and should not determine decisions I think.
>
> Martin
>
> >> > allows.infinite.Date <- function(x) {
> >> >   TRUE
> >> > }
> >> > allows.infinite.POSIXct <- function(x) {
> >> >   TRUE
> >> > }
> >> >
> >> > range.default <- function (..., na.rm = FALSE, finite = FALSE) {
> >> >   x <- c(..., recursive = TRUE)
> >> >   if (allows.infinite(x)) { # changed from `is.numeric()`
> >> >     if (finite)
> >> >       x <- x[is.finite(x)]
> >> >     else if (na.rm)
> >> >       x <- x[!is.na(x)]
> >> >     c(min(x), max(x))
> >> >   }
> >> >   else {
> >> >     if (finite)
> >> >       na.rm <- TRUE
> >> >     c(min(x, na.rm = na.rm), max(x, na.rm = na.rm))
> >> >   }
> >> > }
> >> > ```
> >>
> >> > It could allow other R developers to also use the pattern of:
> >>
> >> > ```
> >> > if (allows.infinite(x)) {
> >> >   # conditionally do stuff with is.infinite(x)
> >> > }
> >> > ```
> >>
> >> > and that seems like it could be rather nice.
> >>
> >> > It would avoid the need for `range.Date()` and `range.POSIXct()`
> >> > methods too.
> >>
> >> > -Davis
> >>
> >> That *is* an interesting alternative perspective ...
> >> sent just about before I was going to commit my proposal (incl
> >> new help page entries, regr.tests ..).
> >>
> >> So we would introduce a new generic allows.infinite() {or
> >> better name?, allowsInf, ..} with the defined semantic that
> >>
> >> allows.infinite(x) for a vector 'x' gives a logical "scalar",
> >> TRUE iff it is known that is.finite(x) "makes sense" and
> >> returns a logical vector of length length(x) .. which is TRUE
> >> where x[i] is not NA/NaN/+Inf/-Inf .. *and*
> >> is.infinite := Negate(is.finite) {or vice versa if you prefer}.
> >>
> >> I agree that this may be useful somewhat more generally than
> >> just for range() methods.
> >>
> >> What do others think?
> >>
> >> Martin
> >>
> >>
> >> > On Thu, May 4, 2023 at 5:29 AM Martin Maechler
> >> > <maechler using stat.math.ethz.ch> wrote:
> >> [......]
> >>
> >> >> >>>>> Davis Vaughan
> >> >> >>>>> on Mon, 1 May 2023 08:46:33 -0400 writes:
> >> >>
> >> >> > Martin,
> >> >> > Yes, I missed that those have `Summary.*` methods, thanks!
> >> >>
> >> >> > Tweaking those to respect `finite = TRUE` sounds great. It seems like
> >> >> > it might be a little tricky since the Summary methods call
> >> >> > `NextMethod()`, and `range.default()` uses `is.numeric()` to determine
> >> >> > whether or not to apply `finite`. Because `is.numeric.Date()` is
> >> >> > defined, that always returns `FALSE` for Dates (and POSIXt). Because
> >> >> > of that, it may still be easier to just write a specific
> >> >> > `range.Date()` method, but I'm not sure.
> >> >>
> >> >> > -Davis
> >> >>
> >> >> I've looked more closely now, and indeed,
> >> >> range() is the only function in the Summary group
> >> >> where (only) the default method has a 'finite' argument.
> >> >> which strikes me as somewhat asymmetric / inconsequential, as
> >> >> after all, range(.) := c(min(.), max(.)) ,
> >> >> but min() and max() do not obey a finite=TRUE setting, note
> >> >>
> >> >> > min(c(-Inf,3:5), finite=TRUE)
> >> >> Error: attempt to use zero-length variable name
> >> >>
> >> >> where the error message also is not particularly friendly
> >> >> and of course has nothing to do with 'finite' :
> >> >>
> >> >> > max(1:4, foo="bar")
> >> >> Error: attempt to use zero-length variable name
> >> >> >
> >> >>
> >> >> ... but that is diverting; coming back to the topic: Given
> >> >> that 'finite' only applies to range() {and there is just a convenience},
> >> >> I do agree that from my own work & support to make `Date` and
> >> >> `POSIX(c)t` behave more number-like, it would be "nice" to have
> >> >> range() obey a `finite=TRUE` also for these.
> >> >>
> >> >> OTOH, there are quite a few other 'number-like' thingies for
> >> >> which I would then like to have range(*, finite=TRUE) work,
> >> >> e.g., "mpfr" (package {Rmpfr}) or "bigz" {gmp} numbers, numeric
> >> >> sparse matrices, ...
> >> >>
> >> >> To keep such methods all internally consistent with
> >> >> range.default(), I could envision something like this
> >> >>
> >> >>
> >> >> .rangeNum <- function(..., na.rm = FALSE, finite = FALSE, isNumeric)
> >> >> {
> >> >>     x <- c(..., recursive = TRUE)
> >> >>     if(isNumeric(x)) {
> >> >>         if(finite) x <- x[is.finite(x)]
> >> >>         else if(na.rm) x <- x[!is.na(x)]
> >> >>         c(min(x), max(x))
> >> >>     } else {
> >> >>         if(finite) na.rm <- TRUE
> >> >>         c(min(x, na.rm=na.rm), max(x, na.rm=na.rm))
> >> >>     }
> >> >> }
> >> >>
> >> >> range.default <- function(..., na.rm = FALSE, finite = FALSE)
> >> >>     .rangeNum(..., na.rm=na.rm, finite=finite, isNumeric = is.numeric)
> >> >>
> >> >> range.POSIXct <- range.Date <- function(..., na.rm = FALSE, finite = FALSE)
> >> >>     .rangeNum(..., na.rm=na.rm, finite=finite, isNumeric = function(.)TRUE)
> >> >>
> >> >>
> >> >>
> >> >> which would also provide .rangeNum() to be used by implementors
> >> >> of other numeric-like classes to provide their own range()
> >> >> method as a 1-liner *and* be future-consistent with the default
> >> >> method.
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> > On Sat, Apr 29, 2023 at 4:47 PM Martin Maechler
> >> >> > <maechler using stat.math.ethz.ch> wrote:
> >> >> >>
> >> >> >> >>>>> Davis Vaughan via R-devel
> >> >> >> >>>>> on Fri, 28 Apr 2023 11:12:27 -0400 writes:
> >> >> >>
> >> >> >> > Hi all,
> >> >> >>
> >> >> >> > I noticed that `range.default()` has a nice `finite =
> >> >> >> > TRUE` argument, but it doesn't actually apply to Date or
> >> >> >> > POSIXct due to how `is.numeric()` works.
> >> >> >>
> >> >> >> Well, I think it would / should never apply:
> >> >> >>
> >> >> >> range() belongs to the "Summary" group generics (as min, max, ...)
> >> >> >>
> >> >> >> and there *are* Summary.Date() and Summary.POSIX{c,l}t() methods.
> >> >> >>
> >> >> >> Without checking further for now, I think you are indirectly
> >> >> >> suggesting to enhance these three Summary.*() methods so they do
> >> >> >> obey 'finite = TRUE' .
> >> >> >>
> >> >> >> I think I agree they should.
> >> >> >>
> >> >> >> Martin
> >> >> >>
> >> >> >> > ```
> >> >> >> > x <- .Date(c(0, Inf, 1, 2, Inf))
> >> >> >> > x
> >> >> >> > #> [1] "1970-01-01" "Inf" "1970-01-02" "1970-01-03" "Inf"
> >> >> >> >
> >> >> >> > # Darn!
> >> >> >> > range(x, finite = TRUE)
> >> >> >> > #> [1] "1970-01-01" "Inf"
> >> >> >> >
> >> >> >> > # What I want
> >> >> >> > .Date(range(unclass(x), finite = TRUE))
> >> >> >> > #> [1] "1970-01-01" "1970-01-03"
> >> >> >> > ```
> >> >> >>
> >> >> >> > I think `finite = TRUE` would be pretty nice for Dates in
> >> >> >> > particular.
> >> >> >>
> >> >> >> > As a motivating example, sometimes you have ranges of
> >> >> >> > dates represented by start/end pairs. It is fairly natural
> >> >> >> > to represent an event that hasn't ended yet with an
> >> >> >> > infinite date. If you need to then compute a sequence of
> >> >> >> > dates spanning the full range of the start/end pairs, it
> >> >> >> > would be nice to be able to use `range(finite = TRUE)` to
> >> >> >> > do so:
> >> >> >>
> >> >> >> > ```
> >> >> >> > start <- as.Date(c("2019-01-05", "2019-01-10", "2019-01-11", "2019-01-14"))
> >> >> >> > end <- as.Date(c("2019-01-07", NA, "2019-01-14", NA))
> >> >> >> > end[is.na(end)] <- Inf
> >> >> >> >
> >> >> >> > # `end = Inf` means that the event hasn't "ended" yet
> >> >> >> > data.frame(start, end)
> >> >> >> > #>        start        end
> >> >> >> > #> 1 2019-01-05 2019-01-07
> >> >> >> > #> 2 2019-01-10        Inf
> >> >> >> > #> 3 2019-01-11 2019-01-14
> >> >> >> > #> 4 2019-01-14        Inf
> >> >> >> >
> >> >> >> > # Create a full sequence along all days in start/end
> >> >> >> > range <- .Date(range(unclass(c(start, end)), finite = TRUE))
> >> >> >> > seq(range[1], range[2], by = 1)
> >> >> >> > #> [1] "2019-01-05" "2019-01-06" "2019-01-07" "2019-01-08" "2019-01-09"
> >> >> >> > #> [6] "2019-01-10" "2019-01-11" "2019-01-12" "2019-01-13" "2019-01-14"
> >> >> >> > ```
> >> >> >>
> >> >> >> > It seems like one option is to create a `range.Date()`
> >> >> >> > method that unclasses, forwards the arguments on to a
> >> >> >> > second call to `range()`, and then reclasses?
> >> >> >>
> >> >> >> > ```
> >> >> >> > range.Date <- function(x, ..., na.rm = FALSE, finite = FALSE) {
> >> >> >> >   .Date(range(unclass(x), na.rm = na.rm, finite = finite), oldClass(x))
> >> >> >> > }
> >> >> >> > ```
> >> >> >>
> >> >> >> > This is similar to how `rep.Date()` works.
> >> >> >>
> >> >> >> > Thanks, Davis Vaughan
> >> >> >>
> >> >> >> > ______________________________________________
> >> >> >> > R-devel using r-project.org mailing list
> >> >> >> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >>