[R] Unexpected date format coercion

Enrico Schumann
Thu Jul 1 12:46:08 CEST 2021


On Thu, 01 Jul 2021, Jeremie Juste writes:

> Hello 
>
> On Thursday,  1 Jul 2021 at 08:25, PIKAL Petr wrote:
>> Hm.
>>
>> Seems to me, that both your codes are wrong but printing in Linux is
>> different from Windows.
>>
>> With
>> as.Date("20-12-2020","%Y-%m-%d")
>> you say that 20 is the year (actually year 20) and 2020 is the day, of which
>> only the first two digits are taken (but with some values the result is NA)
>>
>> I can confirm 4.0.3 in Windows behaves this way too.
>>> as.Date("20-12-2020","%Y-%m-%d")
>> [1] "0020-12-20"
>
> Many thanks for confirming this.
>
>
> On Thursday,  1 Jul 2021 at 18:22, Jim Lemon wrote:
>> Hi Jeremie,
>> Try:
>>
>> as.Date("20-12-2020","%y-%m-%d")
>> [1] "2020-12-20"
>
> Thanks for this info. I'm looking for something that produces NA if the
> date is not exactly in the specified format, so that it can be
> corrected. I was relying on the format argument of as.Date for that.
>
> The issue is that there can be so many variations in date format that, for
> the time being, I still find it easier to delegate the correction to the
> user. A particularly nasty case is when there are multiple date formats in
> the same column.
>
>
> Best regards,
> Jeremie
>

You could explicitly test whether the input matches the
expected format, perhaps with a regex such as

    s <- c("2020-01-20", "20-12-2020")
    grepl("^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$", s)

and/or by checking the components of the dates:

    valid_Date <- function(s) {
        ## split "YYYY-MM-DD" strings into year, month, day
        tmp <- strsplit(s, "[-]")

        year <- as.numeric(sapply(tmp, `[[`, 1))
        valid.year <- year < 2500 & year > 1800

        ## months run from 1 to 12
        month <- as.numeric(sapply(tmp, `[[`, 2))
        valid.month <- month >= 1 & month <= 12

        day <- as.numeric(sapply(tmp, `[[`, 3))
        valid.day <- day >= 1 & day <= 31

        ## parse with an explicit format, then blank out implausible dates
        ans <- as.Date(s, format = "%Y-%m-%d")
        ans[!(valid.year & valid.month & valid.day)] <- NA
        ans
    }
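
To get NA directly whenever a string is not in the expected layout, the
regex test and the conversion could be combined in one small wrapper. This
is only a rough sketch: the name strict_date and its arguments 'pattern'
and 'format' are made up for illustration, and it assumes ISO-style
"YYYY-MM-DD" input by default.

```r
## Hypothetical helper (name and arguments are illustrative only):
## return a Date vector with NA wherever the input does not match
## the expected layout or is not a real calendar date.
strict_date <- function(s,
                        pattern = "^[0-9]{4}-[0-9]{2}-[0-9]{2}$",
                        format = "%Y-%m-%d") {
    ## with an explicit format, impossible dates (e.g. Feb 30) become NA
    ans <- as.Date(s, format = format)
    ## strings whose layout differs (or that are NA) are set to NA as well
    ans[!grepl(pattern, s)] <- NA
    ans
}

s <- c("2020-01-20", "20-12-2020", "2020-02-30")
strict_date(s)
## [1] "2020-01-20" NA           NA
```

The elements that come back as NA can then be reported to the user with
something like s[is.na(strict_date(s))].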



-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net


