[R] Different date formats in one column

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Fri Jun 30 02:23:23 CEST 2017


Left as an exercise for the student. 
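
As a starting point for that exercise, here is a minimal sketch. It assumes that a string such as 0015-02-21 is a yy-mm-dd date whose century was lost, so that it should read 2015-02-21. That interpretation is a guess and is not confirmed anywhere in this thread; the sample values are hypothetical.

```r
# Hypothetical sample values; the first two follow the 00yy-mm-dd shape.
dta <- data.frame( DtStr = c( "0015-02-21", "0015-08-23", "2/22/17" )
                 , stringsAsFactors = FALSE )
dta$Dt <- as.Date( NA )

# ASSUMPTION: a leading "00" stands in for a missing century, so drop it
# and let %y choose the century (00-68 become 20xx, 69-99 become 19xx).
idx <- grepl( "^00[0-9]{2}-(0[1-9]|1[0-2])-[0-9]{2}$", dta$DtStr, perl=TRUE )
dta$Dt[ idx ] <- as.Date( sub( "^00", "", dta$DtStr[ idx ] ), format="%y-%m-%d" )
```

Strings that do not match the pattern keep their NA and can be picked up by the other tests. If the real years could fall before 2000, %y may pick the wrong century, so that needs checking against the data.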
-- 
Sent from my phone. Please excuse my brevity.

On June 29, 2017 7:25:36 PM EDT, Farnoosh Sheikhi <farnoosh_81 at yahoo.com> wrote:
>Thanks Jeff. This is a nice way of solving this problem. What about
>cases like 0015-02-21? Many thanks. Best, Farnoosh
>
> 
>
>On Wednesday, June 28, 2017 10:49 PM, Jeff Newmiller
><jdnewmil at dcn.davis.ca.us> wrote:
> 
>
>I doubt your actual file looks like the mess that made it to my email
>software (below), because you posted HTML-format email. Read the Posting
>Guide, and in particular figure out how to send plain text email.
>
>You might try the "anytime" contributed package, though I suspect it too
>will choke on your mess. Otherwise, that will pretty much leave only a
>brute-force series of regular expression tests to recognize which date
>format patterns you have, and even that may not be able to get them all
>right unless you know something that limits the range of possible formats.
>
>Below is an example of how this can be done. There are many tutorials on
>the internet that describe regular expressions... they are not unique to R.
>
>#-----
>dta <- read.table( text=
>"DtStr
>020917
>2/22/17
>May-2-2015
>May-12-15
>", header=TRUE, as.is=TRUE )
>
># Start with all dates missing; each test below fills in the rows it matches.
>dta$Dt <- as.Date( NA )
>
># Month name, day, four-digit year, e.g. May-2-2015
>idx <- grepl(
>"^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)-[0-9]+-[0-9]{4}$",
>dta$DtStr, perl=TRUE, ignore.case = TRUE )
>dta$Dt[ idx ] <- as.Date( dta$DtStr[ idx ], format="%B-%d-%Y" )
>
># Month name, day, two-digit year, e.g. May-12-15
>idx <- grepl(
>"^(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)-[0-9]+-[0-9]{2}$",
>dta$DtStr, perl=TRUE, ignore.case = TRUE )
>dta$Dt[ idx ] <- as.Date( dta$DtStr[ idx ], format="%B-%d-%y" )
>
># mmddyy with no separators, e.g. 020917
>idx <- grepl( "^(0[1-9]|1[0-2])[0-9]{2}[0-9]{2}$", dta$DtStr, perl=TRUE )
>dta$Dt[ idx ] <- as.Date( dta$DtStr[ idx ], format="%m%d%y" )
>
># m/d/yy with no leading zero on the month, e.g. 2/22/17
>idx <- grepl( "^([1-9]|1[0-2])/[0-9]{1,2}/[0-9]{2}$", dta$DtStr, perl=TRUE )
>dta$Dt[ idx ] <- as.Date( dta$DtStr[ idx ], format="%m/%d/%y" )
>
>
>On Wed, 28 Jun 2017, Farnoosh Sheikhi via R-help wrote:
>
>> Hi, 
>> I have a data set with various date formats in one column and am not
>> sure how to unify it. Here are a few formats:
>>
>02091702/22/170221201703/17/160015-08-239/2/1500170806May-2-201522-March-2014
>> I tried parse_date_time from the lubridate library but it failed. Thanks
>> so much. Best, Farnoosh
>>
>>
>>     [[alternative HTML version deleted]]
>>
>
>---------------------------------------------------------------------------
>Jeff Newmiller                        The     .....       .....  Go Live...
>DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                      Live:   OO#.. Dead: OO#..  Playing
>Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>---------------------------------------------------------------------------
>
>   
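
The repeated grepl()/as.Date() steps in the quoted example can also be written as a loop over pattern/format pairs, which may be easier to extend when more formats turn up. This is only a sketch: the names `fmts` and `parse_mixed` are made up for illustration, the patterns and formats are copied from the example, and month-name matching assumes an English (or C) locale.

```r
# Sample strings taken from the quoted example.
DtStr <- c( "020917", "2/22/17", "May-2-2015", "May-12-15" )

# One row per recognized layout: a regular expression and the matching
# as.Date() format. `fmts` is an illustrative name, not from the thread.
mon <- "(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)"
fmts <- data.frame(
  pattern = c( paste0( "^", mon, "-[0-9]+-[0-9]{4}$" )
             , paste0( "^", mon, "-[0-9]+-[0-9]{2}$" )
             , "^(0[1-9]|1[0-2])[0-9]{2}[0-9]{2}$"
             , "^([1-9]|1[0-2])/[0-9]{1,2}/[0-9]{2}$" ),
  format  = c( "%B-%d-%Y", "%B-%d-%y", "%m%d%y", "%m/%d/%y" ),
  stringsAsFactors = FALSE )

# Hypothetical helper: try each layout in turn and fill in only the
# elements that are still NA, so earlier rows take precedence.
parse_mixed <- function( x, fmts ) {
  out <- rep( as.Date( NA ), length( x ) )
  for ( i in seq_len( nrow( fmts ) ) ) {
    idx <- is.na( out ) &
           grepl( fmts$pattern[ i ], x, perl=TRUE, ignore.case = TRUE )
    out[ idx ] <- as.Date( x[ idx ], format = fmts$format[ i ] )
  }
  out
}

Dt <- parse_mixed( DtStr, fmts )
```

Anything left as NA in `Dt` matched none of the layouts and still needs a pattern of its own.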


