[Rd] Date class shows Inf as NA; this confuses the use of is.na()

Emil Bode emil.bode using dans.knaw.nl
Tue Jun 12 14:00:42 CEST 2018


I agree that calling it invalid is a bit confusing, but I’m not sure what the wording should be, as the problem is that the conversion to POSIXlt is failing.
The best solution would be to extend the whole POSIXlt-class, but that’s too much work.
I’ve done some experiments, and it also seems that the Date class can store larger values than POSIXlt:
> as.Date(8e9, origin='1970-01-01')==as.Date(9e9, origin='1970-01-01')
[1] FALSE
> as.POSIXlt(as.Date(8e9, origin='1970-01-01'))==as.POSIXlt(as.Date(9e9, origin='1970-01-01'))
[1] TRUE
> as.POSIXlt(as.Date(8e9, origin='1970-01-01'))
[1] "-5877641-06-23 UTC"
# Same for 9e9
> as.Date(8e9, origin='1970-01-01')>Sys.Date()
[1] TRUE
> as.POSIXlt(as.Date(8e9, origin='1970-01-01'))>as.POSIXlt(Sys.Date())
[1] FALSE

So the situation as I see it now:

  *   An infinite date may convey some information, so we shouldn’t prohibit it
  *   The same goes for very large values (positive or negative)
  *   But we should warn users that such dates may not be neatly representable, and that the default print method cannot show them
  *   So for values where the POSIXlt-print fails, I think it’s best to print the numerical value, along with some text warning the user
So I’ve adapted the format-function a bit more, with behaviour below.
The details can be adapted of course, but I feel it’s best to print some variant of as.numeric(x) if as.POSIXlt(x) turns out to be unreliable, and otherwise leave is.na() as it is.


format.Date <- function (x, ...)
{
  xx <- format(as.POSIXlt(x), ...)
  names(xx) <- names(x)
  # Non-missing dates outside 0001-01-01 (-719162) to 9999-12-31 (2932896)
  num <- as.numeric(x)
  bad <- !is.na(num) & (num < -719162 | num > 2932896)
  if(any(bad)) {
    xx[bad] <- paste('Date with numerical value', num[bad])
    warning('Some dates are not in the interval 01-01-01 and 9999-12-31, showing numerical value.')
  }
  xx
}
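
For reference, the two limits in the function above are the day counts (relative to 1970-01-01) of 0001-01-01 and 9999-12-31. A minimal sketch of how they could be derived rather than hard-coded; the names lower, upper and out_of_range are only illustrative and not part of base R:

```r
# Day counts relative to 1970-01-01 for both ends of the four-digit-year range.
lower <- as.numeric(as.Date("0001-01-01"))
upper <- as.numeric(as.Date("9999-12-31"))

# Hypothetical helper: TRUE where a non-missing date lies outside [lower, upper].
out_of_range <- function(x) {
  n <- as.numeric(x)
  !is.na(n) & (n < lower | n > upper)
}

# An infinite date, a missing date, the epoch, and a very large negative value.
out_of_range(structure(c(Inf, NA, 0, -8e9), class = "Date"))
```

Missing dates are deliberately not flagged, so they would still be formatted as NA.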

With the following results:

> environment(print.Date) <- .GlobalEnv
> as.Date(Inf, origin='1970-01-01')
[1] "Date with numerical value Inf"
Warning message:
In format.Date(x) :
  Some dates are not in the interval 01-01-01 and 9999-12-31, showing numerical value.
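
A small sketch of why is.na() could probably be left alone under this approach: on the underlying numbers an infinite date and a missing date are already distinguishable, it is only the printed form that makes them look the same. The vector x is just an example, built with structure() to avoid any dependence on the origin argument:

```r
# Example vector: an infinite date, a missing date, and the epoch itself.
x <- structure(c(Inf, NA, 0), class = "Date")

# Only the second element is actually missing.
is.na(x)

# Only the third element has a finite day count.
is.finite(unclass(x))
```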



From: Gabe Becker <becker.gabe using gene.com>
Date: Monday, 11 June 2018 at 23:59
To: Emil Bode <emil.bode using dans.knaw.nl>
Cc: Joris Meys <jorismeys using gmail.com>, Werner Grundlingh <wgrundlingh using gmail.com>, "macqueen1 using llnl.gov" <macqueen1 using llnl.gov>, r-devel <r-devel using r-project.org>
Subject: Re: [Rd] Date class shows Inf as NA; this confuses the use of is.na()

format.Date <- function (x, ...)
{
  xx <- format(as.POSIXlt(x), ...)
  names(xx) <- names(x)
  xx[is.na(xx) & !is.na(x)] <- paste('Invalid date:',as.numeric(x[is.na(xx) & !is.na(x)]))
  xx
}
