[Rd] as.Date nuance
Vladimir Dergachev
vdergachev at rcgardis.com
Mon Mar 26 23:08:18 CEST 2007
On Saturday 24 March 2007 12:12 pm, Gabor Grothendieck wrote:
> It matches in the sense of grep or regexpr
>
> grep("a", "ab") > 0
> regexpr("a", "ab") > 0
>
> Try this:
>
> x <- c("2006-01-01error", "2006-01-01")
> as.Date(x, "%Y-%m-%d") + ifelse(regexpr("^....-..-..$", x) > 0, 0, NA)
>
Well, I still would have expected as.Date() to do the same thing that
as.integer() or as.numeric() do: return NA and produce a warning.
After poking in the code I also noticed that the format guess is done using
the first element only:
> as.Date(c("2006", "2006-01-01"))
Error in fromchar(x) : character string is not in a standard unambiguous format
> as.Date(c("2006-01-01", "2006"))
[1] "2006-01-01" NA
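I attached a patch that changes do_strptime to behave like coerceToInteger:
a string that fails to parse, or that has characters left over after the
format has been matched, becomes NA and a warning is produced. Please let me
know if it is reasonable. I'll then see about getting as.Date() to work
correctly.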
thank you
Vladimir Dergachev
Index: src/main/datetime.c
===================================================================
--- src/main/datetime.c (revision 40895)
+++ src/main/datetime.c (working copy)
@@ -818,9 +818,9 @@
SEXP attribute_hidden do_strptime(SEXP call, SEXP op, SEXP args, SEXP env)
{
SEXP x, sformat, ans, ansnames, klass, stz, tzone;
- int i, n, m, N, invalid, isgmt = 0, settz = 0;
+ int i, n, m, N, invalid, isgmt = 0, settz = 0, warn = 0;
struct tm tm, tm2;
- char *tz = NULL, oldtz[20] = "";
+ char *tz = NULL, oldtz[20] = "", *p;
double psecs = 0.0;
checkArity(op, args);
@@ -859,10 +859,15 @@
tm.tm_year = tm.tm_mon = tm.tm_mday = tm.tm_yday =
tm.tm_wday = NA_INTEGER;
tm.tm_isdst = -1;
- invalid = STRING_ELT(x, i%n) == NA_STRING ||
- !R_strptime(CHAR(STRING_ELT(x, i%n)),
- CHAR(STRING_ELT(sformat, i%m)), &tm, &psecs);
+ invalid = STRING_ELT(x, i%n) == NA_STRING;
if(!invalid) {
+ invalid = !(p=R_strptime(CHAR(STRING_ELT(x, i%n)),
+ CHAR(STRING_ELT(sformat, i%m)), &tm, &psecs)) ||
+ (*p);
+ warn |= invalid;
+ }
+
+ if(!invalid) {
/* Solaris sets missing fields to 0 */
if(tm.tm_mday == 0) tm.tm_mday = NA_INTEGER;
if(tm.tm_mon == NA_INTEGER || tm.tm_mday == NA_INTEGER
@@ -901,6 +906,8 @@
}
if(settz) reset_tz(oldtz);
+ if(warn) warning(_("NAs introduced by coercion"));
+
UNPROTECT(3);
return ans;
}
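
The check the patch adds can be sketched outside of R as follows. This is only
an illustration under stated assumptions: it uses the POSIX strptime() rather
than R's internal R_strptime(), and parse_strict() and parse_all_strict() are
hypothetical helper names that do not exist in the R sources. The idea is the
same as in the diff: keep the pointer that strptime() returns, and treat the
element as invalid if the pointer is NULL or if it does not point at the end of
the string.

```c
#define _XOPEN_SOURCE 700   /* expose strptime() on glibc; must precede includes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper, not part of R.
 * Returns 1 only if fmt consumes ALL of s. Returns 0 if s is NULL,
 * if parsing fails, or if unparsed characters remain after the match.
 */
int parse_strict(const char *s, const char *fmt, struct tm *out)
{
    const char *rest;

    assert(fmt != NULL && out != NULL);
    if (s == NULL)
        return 0;
    memset(out, 0, sizeof *out);
    /* strptime() returns NULL on failure, else the first unparsed character */
    rest = strptime(s, fmt, out);
    return rest != NULL && *rest == '\0';
}

/*
 * Hypothetical vectorised wrapper mirroring the "warn" flag in the patch.
 * valid[i] records the outcome for each element. The return value is
 * nonzero if any non-missing element failed, so the caller can emit one
 * warning. A NULL entry (standing in for NA_STRING) is invalid but, as in
 * the patch, does not trigger the warning.
 */
int parse_all_strict(const char *const *x, size_t n, const char *fmt,
                     struct tm *out, int *valid)
{
    int warn = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (x[i] == NULL) {
            valid[i] = 0;
            continue;
        }
        valid[i] = parse_strict(x[i], fmt, &out[i]);
        warn |= !valid[i];
    }
    return warn;
}
```

With the format "%Y-%m-%d", parse_strict() accepts "2006-01-01" and rejects
"2006-01-01error", which a plain NULL check on the strptime() result would
accept because the prefix matches.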