[Rd] as.Date(Inf) displays as 'NA' but is actually 'Inf'

Gabriel Becker g@bembecker @end|ng |rom gm@||@com
Wed Mar 6 07:01:37 CET 2019


On Tue, Mar 5, 2019 at 9:54 PM Richard White <w using rwhite.no> wrote:

> Hi Gabriel,
>
> The point is that it *visually* displays as NA, but is.na() still
> responds as FALSE.
>
> When I (and I am sure many people) see an NA, we then use is.na(). If we
> see Inf displayed, we then use is.infinite(). With as.Date() this breaks
> down.
>
> I'm not arguing that as.Date(Inf) should be coerced to NA. I'm arguing
> that as.Date(Inf) should be *visually* displayed as Inf (i.e. the truth!).
> I doubt this would break any existing code, because as.Date(Inf) acts as
> Inf in every way possible, except for when you visually look at the output
> printed on the screen.
>
> William - For all the other Date bugs, they don't visually display false
> information about the variable's contents. They might give wrong output,
> but the output displayed is what exists inside the variable.
>
> If we can't trust the R console to display the truth, then we are in a lot
> of trouble.
>

Well, I think it (subtly) actually is the truth, though. What is displayed
when you print a date is the *formatted date string, not the numeric value
stored within the date*. The formatted date string of the infinite date is
actually, and correctly, NA because, for the reasons I pointed out in my last
post, it is indeterminate.

> x = as.Date(Inf, origin = "2018-01-01")
> format(x)
[1] NA


So that is what is happening, both technically and conceptually. For
the record, I'd be surprised by that too, but I think it's a situation of
pieces working correctly individually, but together having a correct but
unintuitive behavior.
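
A minimal sketch of the distinction between the stored value and the
displayed string, and of one way to sidestep the empty-group case raised
earlier in the thread. The helper first_date() is hypothetical (it is not
part of base R) and is only one possible workaround, not a proposed change
to as.Date() or print():

```r
# The stored value is a plain double; printing goes through format().
x <- as.Date(Inf, origin = "2018-01-01")
unclass(x)      # the underlying number, Inf
is.na(x)        # FALSE: the value itself is not missing
is.infinite(x)  # TRUE
is.finite(x)    # FALSE: a check that does not depend on how x prints

# Hypothetical helper (an assumption for illustration): earliest date in a
# vector, returning an NA Date rather than an infinite one when there is
# nothing left to take the minimum of.
first_date <- function(d) {
  d <- d[!is.na(d)]
  if (length(d) == 0L) return(as.Date(NA))
  min(d)
}

first_date(as.Date(c("2018-03-01", "2018-01-15")))
first_date(as.Date(character(0)))
```

Testing with is.finite() rather than reading the printed output avoids
relying on how a particular R version formats a non-finite date.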

Others may feel differently though, that's just my read on it.

Best,
~G



> > a <- as.Date(Inf, origin="2018-01-01")
> > a
> [1] NA
> > is.na(a)
> [1] FALSE
>
> Richard
>
> Gabriel Becker wrote on 06/03/2019 00:33:
>
> Richard,
>
> Well, others may chime in here, but from a mathematical point of view, the
> concept of "infinite days from right now" is well-defined, so it may be a
> "valid" date in that sense, but what day and month it will be (year will be
> Inf) are indeterminate/not well defined. Those are rightfully NA, it
> seems?
>
> I mean you could disallow dates to take Inf at all, ever. I don't feel
> strongly one way or the other about that, personally. That said, if Inf
> dates are allowed, it's not clear to me that displaying the "Formatted" date
> string as NA, even if the value isn't, is wrong, given that it can't be
> determined what that "date" is. It could be displayed differently, I
> suppose, but all the ones I can think of off the top of my head would be
> problematic and probably break lots of formatted-dates parsing code out
> there in the wild (and in R, I would guess). Things like displaying
> "Inf-NA-NA", or just "Inf". Neither of those are going to handle a
> read-write round-trip well, I think.
>
> So my personal don't-really-have-a-hat-in-the-ring opinion would be to
> either leave it as is, or force as.Date(Inf, bla) to actually be NA.
>
> Best,
> ~G
>
> On Tue, Mar 5, 2019 at 12:06 PM Richard White <w using rwhite.no> wrote:
>
>> Hi,
>>
>> I think I've discovered a bug in base R.
>>
>> Basically, when using 'Inf' as a 'Date', it is visually displayed as
>> 'NA', but R still treats it as 'Inf'. So it is very confusing to work
>> with, and can easily lead to errors:
>>
>> # Visually displays as NA
>>  > as.Date(Inf, origin="2018-01-01")
>> [1] NA
>>
>> # Visually displays as NA
>>  > str(as.Date(Inf, origin="2018-01-01"))
>> Date[1:1], format: NA
>>
>> # Is NOT NA
>>  > is.na(as.Date(Inf, origin="2018-01-01"))
>> [1] FALSE
>>
>> # Is still Inf
>>  > is.infinite(as.Date(Inf, origin="2018-01-01"))
>> [1] TRUE
>>
>> This gets really problematic when you are collapsing dates over groups
>> and you want to find the first date of a group. Because min() returns
>> Inf if there is no data:
>>
>> # Visually displays as NA
>>  > as.Date(min(), origin="2018-01-01")
>> [1] NA
>> Warning message: In min() : no non-missing arguments to min; returning Inf
>>
>> # Visually displays as NA
>>  > str(as.Date(min(), origin="2018-01-01"))
>> Date[1:1], format: NA
>> Warning message: In min() : no non-missing arguments to min; returning Inf
>>
>> # Is not NA
>>  > is.na(as.Date(min(), origin="2018-01-01"))
>> [1] FALSE
>> Warning message: In min() : no non-missing arguments to min; returning Inf
>>
>> # This is bad!
>>  > as.Date(min(), origin="2018-01-01") > "2018-01-01"
>> [1] TRUE
>> Warning message: In min() : no non-missing arguments to min; returning Inf
>>
>> Here is my sessionInfo():
>>
>>  > sessionInfo()
>> R version 3.5.0 (2018-04-23)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>> Running under: Debian GNU/Linux 9 (stretch)
>> Matrix products: default
>> BLAS: /usr/lib/openblas-base/libblas.so.3
>> LAPACK: /usr/lib/libopenblasp-r0.2.19.so
>>
>> locale:
>> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8 LC_COLLATE=C.UTF-8
>> LC_MONETARY=C.UTF-8
>> [6] LC_MESSAGES=C LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> loaded via a namespace (and not attached):
>> [1] compiler_3.5.0 tools_3.5.0 yaml_2.1.19
>>
>>  > Sys.getlocale()
>> [1] "LC_CTYPE=C.UTF-8;LC_NUMERIC=C;LC_TIME=C.UTF-8;LC_COLLATE=C.UTF-8;LC_MONETARY=C.UTF-8;LC_MESSAGES=C;LC_PAPER=C.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=C.UTF-8;LC_IDENTIFICATION=C"
>>
>> ______________________________________________
>> R-devel using r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
>
