[Rd] performance issue with as.Date
Paul.Ryan at csiro.au
Tue Apr 30 03:51:34 CEST 2013
We encountered a performance problem when running a large number of R scripts simultaneously. A large number of stat() system calls on /etc/timezone was limiting how many scripts could be run effectively. I traced the problem to as.Date.character, where strptime() is called without a tz argument when no format argument is supplied:
as.Date.character <- function(x, format="", ...)
{
    charToDate <- function(x) {
        xx <- x[1L]
        if(is.na(xx)) {
            j <- 1L
            while(is.na(xx) && (j <- j+1L) <= length(x)) xx <- x[j]
            if(is.na(xx)) f <- "%Y-%m-%d" # all NAs
        }
        if(is.na(xx) ||
           !is.na(strptime(xx, f <- "%Y-%m-%d", tz="GMT")) ||
           !is.na(strptime(xx, f <- "%Y/%m/%d", tz="GMT"))
           ) return(strptime(x, f))
        stop("character string is not in a standard unambiguous format")
    }
    res <- if(missing(format)) charToDate(x) else strptime(x, format, tz="GMT")
    as.Date(res)
}
We could easily work around this by specifying a format. My question is: should the final strptime(x, f) call in charToDate() also be given a tz argument, as is done when a format is specified?
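To make the two code paths concrete, here is a minimal sketch. The input vector x is made up for illustration, and the second call only mimics what the final strptime() call in charToDate() would look like with tz="GMT" added; it is not a tested patch against R itself.

```r
# Hypothetical input: any character vector of ISO-style dates would do.
x <- c("2013-04-30", NA, "2013-05-01")

# Workaround: with an explicit format, as.Date.character() takes the
# strptime(x, format, tz="GMT") branch and never calls charToDate().
d1 <- as.Date(x, format = "%Y-%m-%d")

# Sketch of the change being asked about: the same parse that
# charToDate() ends with, but with the timezone passed explicitly.
d2 <- as.Date(strptime(x, "%Y-%m-%d", tz = "GMT"))
```

Both calls should give the same Date vector, since only year, month and day are parsed. The difference I would expect is in whether the local timezone has to be looked up during the parse.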
Thanks,
Paul