[R] Converting string to date
Jeff Newmiller
jdnewmil at dcn.davis.ca.us
Sun Aug 4 21:59:58 CEST 2013
On Sun, 4 Aug 2013, Ron Michael wrote:
> Hi,
>
> I want to convert following string to a Date format (mm/dd/yyyy):
>
> MyString <- c("Sun Sep 01 00:00:00 EDT 2013", "Sun Dec 01 00:00:00 EST 2013")
>
> Can somebody point me if it is possible to do that?

I think the answer to "is it possible" is a qualified yes: most things
are possible if you limit your scope enough.

EST and EDT belong to an informal system of timezone abbreviations that is
not standardized around the world, so your format is not strictly
unambiguous. For example, EDT can refer to daylight saving time in the
-0500 zone (eastern North America) or to daylight saving time in the
+1000 zone (eastern Australia). [1] You have to interpret the EST/EDT
notation according to your local expectations, and be careful not to apply
it to data that falls outside your local assumptions. Here, I assume you
mean to handle data from the eastern area of the United States:

MyString <- c( "Sun Sep 01 00:00:00 EDT 2013"
             , "Sun Dec 01 00:00:00 EST 2013"
             , "Sun Dec 01 00:00:00 AST 2013" )
Sys.setenv( TZ="America/New_York" )
MyStringFixed <- MyString
MyStringFixed <- sub( 'EST', '-0500', MyStringFixed )
MyStringFixed <- sub( 'EDT', '-0400', MyStringFixed )
MyStringFixed <- sub( 'AST', '-0400', MyStringFixed )
# see ?strptime for format string definitions
MyDateTm <- as.POSIXct( MyStringFixed, format="%a %b %d %H:%M:%S %z %Y" )
# format() is the documented way to render a date-time as text
MyDateStr <- format( MyDateTm, format="%m/%d/%Y" )

As my code shows, you also need to be aware that only one timezone can be
in effect for a particular POSIXct vector. Although the input strings may
each use a different timezone, they are all converted internally to UTC
and displayed according to the tz attribute of the whole vector (in this
case the empty string, which means the TZ environment variable applies).
As a result, in the general case of arbitrary input timezones, the date
you get after formatting may not be the same as the original "date" in the
input string (because of timezone differences).

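To make the date shift concrete, here is a small sketch. The input string is an example I made up (11 pm at offset -0800), not one of the strings from the question, and the %a and %b specifiers assume an English locale. The same instant is formatted once for New York and once for Los Angeles:

```r
# Assumed example input, not from the original question:
# 11 pm on 1 Dec 2013 at UTC offset -0800
fmt <- "%a %b %d %H:%M:%S %z %Y"   # %a and %b assume an English locale
x <- as.POSIXct( "Sun Dec 01 23:00:00 -0800 2013"
               , format = fmt
               , tz = "America/New_York" )

# Same instant, rendered as a date in two different timezones
east <- format( x, "%m/%d/%Y", tz = "America/New_York" )
west <- format( x, "%m/%d/%Y", tz = "America/Los_Angeles" )
print( c( east = east, west = west ) )
```

The instant is 07:00 UTC on 2 December, so east should come out as "12/02/2013" while west stays at "12/01/2013". Which of those is "right" depends entirely on what you intend the date to mean.
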
If you want to cover your ears and eyes and say "Timezones don't exist"
then you can simply strip out the timezone information from the input
strings before you convert them:

MyStringFixed <- MyString
# remove the "HH:MM:SS ZONE " portion of each string
MyStringFixed <- sub( '[0-9][0-9]:[0-9][0-9]:[0-9][0-9] [^ ]+ '
                    , ''
                    , MyStringFixed )
MyDateTm <- as.POSIXct( MyStringFixed, format="%a %b %d %Y" )
# format() is the documented way to render a date-time as text
MyDateStr <- format( MyDateTm, format="%m/%d/%Y" )

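If only the calendar date matters, a possible variant of the same idea is to parse the stripped strings with as.Date() instead of as.POSIXct(), so that no time of day or timezone is carried along at all. This is a sketch under the same assumptions as above (English locale for %a and %b, input strings laid out exactly like the two in the question), and the name MyDate is just one I picked for illustration:

```r
MyString <- c( "Sun Sep 01 00:00:00 EDT 2013"
             , "Sun Dec 01 00:00:00 EST 2013" )

# remove the "HH:MM:SS ZONE " portion, leaving e.g. "Sun Sep 01 2013"
MyStringFixed <- sub( '[0-9][0-9]:[0-9][0-9]:[0-9][0-9] [^ ]+ '
                    , ''
                    , MyString )

# parse straight to Date; strings that do not match the format become NA
MyDate <- as.Date( MyStringFixed, format = "%a %b %d %Y" )
MyDateStr <- format( MyDate, "%m/%d/%Y" )
print( MyDateStr )
```

It is worth checking any( is.na( MyDate ) ) afterwards, since a string that does not fit the format is silently turned into NA rather than raising an error.
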
---
[1] http://www.timeanddate.com/library/abbreviations/timezones/