[R] Mixed format

Chris Evans chrishold sending from psyctc.org
Tue Jan 21 10:22:24 CET 2020


I think that might give the wrong date for a value like 1/3/1990, which I believe is mdy rather than dmy in Val's data, because the dmy round runs first and accepts it.
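
A minimal illustration of that ambiguity, as a sketch only. It uses base R's as.Date() with explicit formats instead of lubridate (my assumption, purely to keep it dependency-free), and the variable names are made up for the example:

```r
# The same string read day-first and month-first gives two different dates.
x <- "1/3/1990"

as_dmy <- as.Date(x, format = "%d/%m/%Y")  # day first:   1 March 1990
as_mdy <- as.Date(x, format = "%m/%d/%Y")  # month first: 3 January 1990

format(as_dmy)  # "1990-03-01"
format(as_mdy)  # "1990-01-03"
```

Both parses succeed, so a "try dmy, then fill the NAs with mdy" strategy never reaches the second round for a string like this.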

As I read the data, where the separator is "/" the format is mdy and where the separator is "-" it's dmy.  So I would
go for:

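library(lubridate)
# start from an all-NA Date column so the indexed assignments work even if dnew does not exist yet
DFX$dnew <- as.Date(NA)
hyphen <- grepl("-", DFX$ddate, fixed = TRUE)  # "-" separator: dmy
slash  <- grepl("/", DFX$ddate, fixed = TRUE)  # "/" separator: mdy
DFX$dnew[hyphen] <- dmy(DFX$ddate[hyphen])
DFX$dnew[slash]  <- mdy(DFX$ddate[slash])
DFX <- DFX[!is.na(DFX$dnew), ]                 # drop rows that matched neither format
DFX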

  name      ddate       dnew
1    A   19-10-02 2002-10-19
2    B   22-11-20 2020-11-22
3    C   19-01-15 2015-01-19
4    D 11/19/2006 2006-11-19
5    F   9/9/2011 2011-09-09
6    G 12/29/2010 2010-12-29
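
For anyone without lubridate, the same separator rule can be sketched in base R. This is only a sketch under my reading of the data: parse_by_separator() is a hypothetical helper, the two format strings are my assumptions about Val's columns, and dstd is a made-up column name holding the "%m-%d-%Y" style Val mentions.

```r
# Sample data from the thread; stringsAsFactors = FALSE keeps ddate as character
DFX <- read.table(text = "name ddate
A 19-10-02
B 22-11-20u
C 19-01-15
D 11/19/2006
F 9/9/2011
G 12/29/2010
H DEX", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper: choose the format from the separator.
# Assumes "-" means day-month-2-digit-year and "/" means month/day/4-digit-year.
parse_by_separator <- function(x) {
  out <- rep(as.Date(NA), length(x))      # NA for anything matching neither rule
  hyphen <- grepl("-", x, fixed = TRUE)
  slash  <- grepl("/", x, fixed = TRUE)
  # as.Date() with a format ignores trailing characters such as the "u" in row B
  out[hyphen] <- as.Date(x[hyphen], format = "%d-%m-%y")
  out[slash]  <- as.Date(x[slash],  format = "%m/%d/%Y")
  out
}

DFX$dnew <- parse_by_separator(DFX$ddate)
DFX <- DFX[!is.na(DFX$dnew), ]            # drop rows that are not dates
DFX$dstd <- format(DFX$dnew, "%m-%d-%Y")  # one common text format
DFX
```

Keeping dnew as a real Date column and producing the text form only at the end means sorting and date arithmetic still work on dnew.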

But I am so much in awe of Rui's skills with R, and those of most of the regular commentators here, that I submit
this a little nervously!

Many thanks to all who teach me so much here. It is lovely, if I am correct, to contribute for a change!

Chris


----- Original Message -----
> From: "Rui Barradas" <ruipbarradas using sapo.pt>
> To: "Val" <valkremk using gmail.com>, "r-help using R-project.org (r-help using r-project.org)" <r-help using r-project.org>
> Sent: Tuesday, 21 January, 2020 00:40:29
> Subject: Re: [R] Mixed format

> Hello,
> 
> The following strategy works with your data.
> It uses the fact that most dates are in one of 3 formats, dmy, mdy, ymd.
> It tries those formats one by one, after each try looks for NA's in the
> new column.
> 
> 
> # first round, format is dmy
> DFX$dnew <- lubridate::dmy(DFX$ddate)
> na <- is.na(DFX$dnew)
> 
> # second round, format is mdy
> DFX$dnew[na] <- lubridate::mdy(DFX$ddate[na])
> na <- is.na(DFX$dnew)
> 
> # last round, format is ymd
> DFX$dnew[na] <- lubridate::ymd(DFX$ddate[na])
> 
> # remove what didn't fit any format
> DFX <- DFX[!is.na(DFX$dnew), ]
> DFX
> 
> 
> Hope this helps,
> 
> Rui Barradas
> 
> At 22:58 on 20/01/20, Val wrote:
>> Hi All,
>> 
>> I have a data frame where one column holds dates in mixed formats,
>> some in the form "%m-%d-%y" and some in "%m/%d/%Y"; some entries are not dates at all.
>> 
>> Is there a way to delete the rows that contain non-dates and
>> standardize the dates in one date format like %m-%d-%Y?
>> Please see my sample data and desired output.
>> 
>> DFX<-read.table(text="name ddate
>>    A  19-10-02
>>    B  22-11-20u
>>    C  19-01-15
>>    D  11/19/2006
>>    F  9/9/2011
>>    G  12/29/2010
>>    H  DEX",header=TRUE)
>> 
>> Desired output
>> name ddate
>> A  19-10-2002
>> B  22-11-2020
>> C  19-01-2015
>> D  11-19-2006
>> F  09-09-2011
>> G  12-29-2010
>> 
>> Thank you
>> 
>> ______________________________________________
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
> 

-- 
Chris Evans <chris using psyctc.org> Visiting Professor, University of Sheffield <chris.evans using sheffield.ac.uk>
I do some consultation work for the University of Roehampton <chris.evans using roehampton.ac.uk> and other places
but <chris using psyctc.org> remains my main Email address.  I have a work web site at:
   https://www.psyctc.org/psyctc/
and a site I manage for CORE and CORE system trust at:
   http://www.coresystemtrust.org.uk/
I have "semigrated" to France, see: 
   https://www.psyctc.org/pelerinage2016/semigrating-to-france/ 
That page will also take you to my blog which started with earlier joys in France and Spain!

If you want to book to talk, I am trying to keep that to Thursdays and my diary is at:
   https://www.psyctc.org/pelerinage2016/ceworkdiary/
Beware: French time, generally an hour ahead of UK.


