[R] as.Date() results depend on order of data within vector?
Patrick Connolly
p_connolly at ihug.co.nz
Sun Jan 7 20:42:32 CET 2007
On Sun, 07-Jan-2007 at 12:01PM +0000, Mark Wardle wrote:
|> Dear all,
|>
|> The as.Date() function appears to give different results depending on
|> the order of the vector passed into it.
|>
|> d1 = c("1900-01-01", "2007-01-01","","2001-05-03")
|> d2 = c("", "1900-01-01", "2007-01-01","2001-05-03")
|> as.Date(d1) # gives correct results
|> as.Date(d2) # fails with error (* see below)
|>
|> This problem does not arise if the dates are NA rather than an empty
|> string, but my data is coming via RODBC and I still don't have NAs
|> passed across properly.
|>
|> I might add that I initially noticed this behaviour when using RODBC's
|> sqlQuery() function call, and I initially had difficulty explaining why
|> one column of dates was passed correctly, but another failed. The
|> failing column was a "date of death" column where it was NA ("") for
|> most patients.
|>
|> I've come up with two workarounds that work. The first is to sort the
|> data at the SQL level, ensuring the initial record is not null. The
|> second is to use sqlQuery() with as.is=T option, and then do the sorting
|> and conversion afterwards.
Simpler, I think, is to add one line that turns the empty strings into NA
before calling as.Date():
d2[d2 == ""] <- NA
I've not tested the idea extensively, so there might be occasions
where it falls down. If you're working with a data frame, you can use
one of the apply functions (lapply, say) to apply it to all columns.
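
To spell that out, here is a minimal sketch of the idea on your d2 vector
and on a small made-up data frame. The helper name blank_to_na and the
columns dob/dod are purely illustrative, not anything from RODBC or your
data. My understanding is that as.Date() on character input, with no
format given, guesses the format from the first non-NA element, which
would explain why a leading "" upsets it while a leading NA does not;
treat that as an assumption, as the details may differ between R versions.

```r
# Hypothetical helper: replace empty strings with NA in a character vector.
# %in% is used rather than == so that existing NAs are left alone.
blank_to_na <- function(x) {
  x[x %in% ""] <- NA
  x
}

# The vector from the original post, empty string first
d2 <- c("", "1900-01-01", "2007-01-01", "2001-05-03")
dates <- as.Date(blank_to_na(d2))

# Made-up data frame with character date columns (as with as.is = TRUE)
df <- data.frame(dob = c("1950-02-03", "1961-07-15"),
                 dod = c("", "2006-11-30"),
                 stringsAsFactors = FALSE)

# Clean every column, then convert every column to Date
df[] <- lapply(df, blank_to_na)
df[] <- lapply(df, as.Date)
```

Supplying the format explicitly, as in as.Date(d2, format = "%Y-%m-%d"),
should also avoid the guessing step altogether, since an empty string then
simply fails to parse and comes back as NA.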
HTH
--
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
___ Patrick Connolly
{~._.~} Great minds discuss ideas
_( Y )_ Middle minds discuss events
(:_~*~_:) Small minds discuss people
(_)-(_) ..... Anon
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.