[Rd] Invalid date-times and as.POSIXct problems (remotely related to DST issues)
Karl Ove Hufthammer
karl at huftis.org
Mon Mar 12 15:29:14 CET 2012
I think this should be handled as a bug, but I’m not sure which
platforms and versions it applies to, so I’m writing to this list. The
problem is that as.POSIXct on character strings behaves in a strange way
if one of the date-times is invalid; it converts all the date-times to
dates (i.e., it discards the time part).
Example, which I suspect only works in my locale, which uses the UTC+1/UTC+2
timezone:
$ dates=c("2003-10-13 00:15:00", "2008-06-03 14:45:00", "2003-03-30 02:00:00")
Note that the last date-time doesn’t actually exist
(due to daylight saving time):
http://www.timeanddate.com/worldclock/meetingtime.html?day=30&month=3&year=2003&p1=187&iv=0
$ d12=as.POSIXct(dates)
$ d123=as.POSIXct(dates[1:2])
$ d12
[1] "2003-10-13 CEST" "2008-06-03 CEST" "2003-03-30 CET"
$ d123
[1] "2003-10-13 00:15:00 CEST" "2008-06-03 14:45:00 CEST"
When I include all values, they are all converted to (POSIXct) *dates*,
but if I exclude the invalid one, the rest are properly converted to
(POSIXct) date-times. Note that this is not just a display issue:
$ unclass(d12)
[1] 1065996000 1212444000 1048978800
attr(,"tzone")
[1] ""
$ unclass(d123)
[1] 1065996900 1212497100
attr(,"tzone")
[1] ""
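A possible explanation, stated here as an assumption rather than something confirmed by the output above: if as.POSIXct, when given no format, tries a list of candidate formats and keeps the first one that parses every string, then a single bad time would force the date-only format onto the whole vector. The sketch below checks three candidate formats by hand with strptime. The format list is an assumed subset chosen for illustration, and the impossible hour 25 is used so that the result does not depend on the timezone.

```r
# Illustration only: the candidate formats are an assumed subset, and
# hour 25 is used because it is invalid in every timezone.
x <- c("2003-10-13 00:15:00", "2008-06-03 14:45:00", "2003-03-30 25:00:00")
formats <- c("%Y-%m-%d %H:%M:%OS", "%Y-%m-%d %H:%M", "%Y-%m-%d")

# One column per format, one row per string: TRUE where strptime succeeds.
parses <- sapply(formats, function(f) !is.na(strptime(x, format = f, tz = "UTC")))

# A format is usable for the whole vector only if every string parses with it.
usable <- apply(parses, 2, all)
print(usable)
```

If only the date-only format comes out as usable for all three strings, that would be consistent with the truncation shown above.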
I can only reproduce this on Windows; on Linux all the strings are
converted to date-times (the last one to 2003-03-30 01:00:00 CET).
However, if one specifies a completely invalid time, e.g., 25:00, the
same thing does happen on Linux (2.14.2 Patched). I think the right/best
behaviour would be to convert the invalid date-time string to NA and
convert the other ones to proper POSIXct date-times, and perhaps issue a
warning about NAs being generated.
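Until then, a workaround along these lines might give the behaviour described: a minimal sketch, assuming all strings share one known format. The helper name parse_datetimes and its default format are hypothetical, chosen for illustration. Passing an explicit format to strptime means each string is parsed on its own, so an unparseable one becomes NA without affecting the others.

```r
# Hypothetical helper (not part of R): parse with one explicit format so
# that a bad element becomes NA instead of changing how the rest are read.
parse_datetimes <- function(x, format = "%Y-%m-%d %H:%M:%S", tz = "") {
  parsed <- as.POSIXct(strptime(x, format = format, tz = tz))
  # Elements that were not missing on input but failed to parse.
  bad <- is.na(parsed) & !is.na(x)
  if (any(bad)) {
    warning(sum(bad), " date-time string(s) could not be parsed; NAs introduced")
  }
  parsed
}

# Hour 25 is used here because it is invalid regardless of timezone.
x <- c("2003-10-13 00:15:00", "2008-06-03 14:45:00", "2003-03-30 25:00:00")
res <- suppressWarnings(parse_datetimes(x, tz = "UTC"))
print(res)
```

How a time that falls in a DST gap (such as 02:00 on 2003-03-30 in my timezone) is treated may still differ between platforms, so this sketch only covers strings that strptime rejects outright.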
(I originally discovered this problem on data from an Oracle database,
using sqlQuery() from the RODBC package, which automatically converts
date-times to date-times in the current timezone (except if you specify
as.is=TRUE), and was surprised that for some queries the date-times were
truncated to dates. A warning that parts of the data were invalid would
be very welcome.)
Version details (for Windows):
$ version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          14.2
year           2012
month          02
day            29
svn rev        58522
language       R
version.string R version 2.14.2 (2012-02-29)
$ sessionInfo()
R version 2.14.2 (2012-02-29)
Platform: i386-pc-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=Norwegian-Nynorsk_Norway.1252  LC_CTYPE=Norwegian-Nynorsk_Norway.1252  LC_MONETARY=Norwegian-Nynorsk_Norway.1252
[4] LC_NUMERIC=C  LC_TIME=Norwegian-Nynorsk_Norway.1252
attached base packages:
[1] stats graphics grDevices datasets utils methods base
--
Karl Ove Hufthammer