[R] convert 'character' vector containing mixed formats to 'Date'
Duncan Murdoch
murdoch.duncan at gmail.com
Thu Jun 21 15:13:00 CEST 2012
On 12-06-21 8:48 AM, Liviu Andronic wrote:
> Dear all
> I have a 'character' vector containing mixed formats (thanks Excel!)
> and I'd like to translate it into a default "%Y-%m-%d" Date vector.
> x<- c("1/3/2005", "13/04/2004", "2/5/2005", "2/5/2005", "7/5/2007",
> "22/04/2004", "21/04/2005", "20080430", "13/05/2003", "20080529",
> NA, NA, "19/05/1999", "17/05/2000", "17/05/2000")
>
>
> In the above you will see that some dates are of format="%d/%m/%Y",
> others of format="%Y%m%d" and some NA values. Can you suggest a
> straight-forward way of transforming these to a uniform 'character' or
> 'Date' vector? I tried to do the following, but it outputs very
> strange results:
>> x
> [1] "1/3/2005" "13/04/2004" "2/5/2005" "2/5/2005" "7/5/2007" "22/04/2004"
> [7] "21/04/2005" "20080430" "13/05/2003" "20080529" NA NA
> [13] "19/05/1999" "17/05/2000" "17/05/2000"
>> sum(xa<- grepl('/', x))
> [1] 11
>> sum(xb<- grepl('200', substr(x, 1,4)))
> [1] 2
>> sum(xc<- is.na(x))
> [1] 2
>> x[xa]<- as.Date(x[xa], format="%d/%m/%Y")
>> x[xb]<- as.Date(x[xb], format="%Y%m%d")
>> x
> [1] "12843" "12521" "12905" "12905" "13640" "12530" "12894" "13999" "12185" "14028"
> [11] NA NA "10730" "11094" "11094"
>
>
> The culprit is likely that the 'x' vector is 'character' throughout,
> but I'm not sure how to work around it. For example, I couldn't figure
> out how to create an empty 'Date' vector. Regards
You probably don't want the vector to be empty, so something like this
would work:
y <- as.Date(rep(NA, 15))
Then things like
y[xa] <- as.Date(x[xa], format="%d/%m/%Y")
etc. should work.
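
Putting the pieces together, a minimal sketch of the whole conversion might look like the following. It reuses the x, xa and xb from the original post. The only change is that the length of y is taken from length(x) rather than hard-coded as 15, which is an assumption about what is wanted. The strange numbers in the original attempt appear because assigning a Date into a 'character' vector stores the underlying day count (days since 1970-01-01) as text. Filling a separate Date vector avoids that coercion.

```r
x <- c("1/3/2005", "13/04/2004", "2/5/2005", "2/5/2005", "7/5/2007",
       "22/04/2004", "21/04/2005", "20080430", "13/05/2003", "20080529",
       NA, NA, "19/05/1999", "17/05/2000", "17/05/2000")

# Logical index vectors, as in the original post.
# grepl() returns FALSE for NA input, so the NA entries match neither.
xa <- grepl("/", x)                     # entries in "%d/%m/%Y" form
xb <- grepl("200", substr(x, 1, 4))     # entries in "%Y%m%d" form

# Start from an all-NA Date vector of the same length as x,
# so the result stays of class 'Date' throughout.
y <- as.Date(rep(NA, length(x)))

# Fill each subset using the format that applies to it.
y[xa] <- as.Date(x[xa], format = "%d/%m/%Y")
y[xb] <- as.Date(x[xb], format = "%Y%m%d")

class(y)       # "Date"
format(y[1])   # "2005-03-01"
```

Note that the test for xb only catches years starting with "200", so it would need adjusting for data that covers other decades.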
Duncan Murdoch