[R] Time and date conversion
MacQueen, Don
m@cqueen1 @end|ng |rom ||n|@gov
Thu Jun 7 01:26:58 CEST 2018
After you've solved the format inconsistency issues, per Peter's advice, you will need to understand that R internally converts and stores date-time values in UTC. Therefore, it is essential to give it the correct timezone specification on input.
You do not "convert to the UTC time zone". Instead, you tell R to format in UTC when you print. If you don't specify a timezone on input, R assumes your local timezone.
Here's an example of how to do it right, assuming "CT" is meant to be the US central timezone.
> t1 <- '2018-02-03 11:15:17 CT'
> t1t <- as.POSIXct(t1, tz='US/Central')
> print(t1t)
[1] "2018-02-03 11:15:17 CST"
> format(t1t, tz='UTC')
[1] "2018-02-03 17:15:17"
UTC is 6 hours ahead of US central time zone in February, so the displayed UTC hour is "17" instead of the US central "11".
The "CT" is ignored on input.
Note that there is no R command to convert from US/Central to UTC. There is only formatting. The actual data itself does not change.
> t1u <- '2018-02-03 17:15:17'
> t1ut <- as.POSIXct(t1u, tz='UTC')
> as.numeric(t1t)
[1] 1517678117
> as.numeric(t1ut)
[1] 1517678117
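To make the "only formatting changes" point concrete, here is a minimal sketch. It assumes 'America/Chicago' (the canonical IANA name for the same zone as 'US/Central') is available on your system; the variable names x and y are only for illustration. Changing the "tzone" attribute changes how the value prints, not the stored number of seconds.

```r
# Same instant, two display timezones.
x <- as.POSIXct("2018-02-03 11:15:17", tz = "America/Chicago")

# Copy the value and change only the display timezone attribute.
y <- x
attr(y, "tzone") <- "UTC"

format(x)   # "2018-02-03 11:15:17"
format(y)   # "2018-02-03 17:15:17"

# The underlying seconds since 1970-01-01 UTC are unchanged.
as.numeric(x) == as.numeric(y)   # TRUE
```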
If the input time were during daylight saving time (say, in June), the difference would be 5 hours; the UTC formatted hour would be "16".
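A quick check of the June case, as a sketch under the assumption that 'America/Chicago' (used here in place of 'US/Central') is known to your system; the date itself is just an example I picked.

```r
# Same clock time, but on a date when US central time observes DST.
t2  <- "2018-06-03 11:15:17"
t2t <- as.POSIXct(t2, tz = "America/Chicago")

format(t2t, "%Z")        # "CDT"
format(t2t, tz = "UTC")  # "2018-06-03 16:15:17"
```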
------- further comments -------
There is some danger in using as.POSIXct, because it does not force you to supply a timezone (strptime makes you supply a format, but its tz argument is optional as well). If no tz is supplied, as.POSIXct will default to the session's timezone (PST in my case):
> t1p <- as.POSIXct(t1)
> print(t1p)
[1] "2018-02-03 11:15:17 PST"
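To see why this matters, the sketch below parses the same character string under two explicit timezones. The zone names 'America/Chicago' and 'America/Los_Angeles' are my choices for illustration (the second stands in for a Pacific-time session). The two results are different instants, two hours apart.

```r
s <- "2018-02-03 11:15:17"

# Identical text, different tz on input, different stored instants.
central <- as.POSIXct(s, tz = "America/Chicago")
pacific <- as.POSIXct(s, tz = "America/Los_Angeles")

format(central, tz = "UTC")   # "2018-02-03 17:15:17"
format(pacific, tz = "UTC")   # "2018-02-03 19:15:17"

# Difference between the two instants, in hours.
as.numeric(difftime(pacific, central, units = "hours"))   # 2
```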
My system does not recognize "CT" or "CST" as valid timezone codes:
> t1t <- as.POSIXct(t1, tz='CT')
Warning messages:
1: In strptime(xx, f <- "%Y-%m-%d %H:%M:%OS", tz = tz) :
unknown timezone 'CT'
2: In as.POSIXct.POSIXlt(x) : unknown timezone 'CT'
3: In strptime(x, f, tz = tz) : unknown timezone 'CT'
4: In as.POSIXct.POSIXlt(as.POSIXlt(x, tz, ...), tz, ...) :
unknown timezone 'CT'
> t1t <- as.POSIXct(t1, tz='CST')
Warning messages:
1: In strptime(xx, f <- "%Y-%m-%d %H:%M:%OS", tz = tz) :
unknown timezone 'CST'
2: In as.POSIXct.POSIXlt(x) : unknown timezone 'CST'
3: In strptime(x, f, tz = tz) : unknown timezone 'CST'
4: In as.POSIXct.POSIXlt(as.POSIXlt(x, tz, ...), tz, ...) :
unknown timezone 'CST'
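One way to avoid these warnings is to check a name against OlsonNames(), which lists the timezone names R knows about on the current system. This is only a sketch; tz_ok is a hypothetical helper name, and the result depends on the timezone database installed on your machine.

```r
# Hypothetical helper: TRUE for names present in the system's
# timezone database, FALSE otherwise.
tz_ok <- function(tz) tz %in% OlsonNames()

# Check the candidates before passing any of them as 'tz'.
candidates <- c("CT", "CST", "America/Chicago")
ok <- tz_ok(candidates)
candidates[ok]
```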
R uses the underlying operating system date/time libraries to recognize when an input datetime falls during daylight saving time, and converts to UTC accordingly. Therefore, if your incoming character strings are in standard time all year round (a common practice for some kinds of real-time data collection, such as meteorological data), a DST-aware timezone such as 'US/Central' won't work. You would have to use
t1ut <- as.POSIXct(t1u, tz='Etc/GMT+6')
for the US central timezone (on a Mac or Linux box; I don't know about Windows).
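Note the POSIX sign convention: 'Etc/GMT+6' means 6 hours behind UTC. The sketch below contrasts the two choices for a June timestamp, assuming both 'America/Chicago' (standing in for 'US/Central') and 'Etc/GMT+6' are available on your system; the variable names are only for illustration.

```r
s <- "2018-06-03 11:15:17"

# DST-aware zone: June is treated as CDT (UTC-5).
dst_aware <- as.POSIXct(s, tz = "America/Chicago")

# Fixed-offset zone: always UTC-6, i.e. central standard time all year.
std_only <- as.POSIXct(s, tz = "Etc/GMT+6")

format(dst_aware, tz = "UTC")   # "2018-06-03 16:15:17"
format(std_only,  tz = "UTC")   # "2018-06-03 17:15:17"

# The two interpretations differ by one hour (3600 seconds).
as.numeric(std_only) - as.numeric(dst_aware)   # 3600
```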
As far as I understand it, the only way to specify the timezone when converting from character to datetime is using the 'tz' argument. A timezone as part of the character string will be ignored (see the formatting codes in ?strptime).
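Putting this together with Peter's gsub() preprocessing (quoted below), a sketch for the original two strings might look like the following. It rests on several assumptions that are mine, not the original poster's: "CT" means US central time with daylight saving (I use 'America/Chicago'), the hours are already on a 24-hour clock so the am/pm marker can be dropped, and the session locale understands English month abbreviations for "%b".

```r
d.txt <- c("03 Jun 2018 10:01 am CT", "01 Jun 2018 22:04:25 pm CT")

# Step 1 (Peter's approach A): add ":00" where seconds are missing.
d.fix <- gsub(" (..:..) ", " \\1:00 ", d.txt)

# Step 2: drop the am/pm marker and the "CT" suffix, since the
# timezone has to be given through the 'tz' argument anyway.
d.fix <- sub(" [ap]m CT$", "", d.fix)

# Step 3: parse with an explicit format and timezone.
d <- as.POSIXct(d.fix, format = "%d %b %Y %H:%M:%S", tz = "America/Chicago")

# Step 4: display in UTC; the stored values are unchanged.
format(d, tz = "UTC", usetz = TRUE)
# "2018-06-03 15:01:00 UTC" "2018-06-02 03:04:25 UTC"
```

If the feed actually reports standard time all year, 'Etc/GMT+6' would replace 'America/Chicago' in step 3.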
I almost always use as.POSIXct() instead of strptime() for conversion from character to datetime, because strptime() returns class POSIXlt, and I generally find POSIXct more appropriate for how I work with datetime data.
-Don
--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
Lab cell 925-724-7509
On 6/4/18, 3:54 AM, "R-help on behalf of peter dalgaard" <r-help-bounces using r-project.org on behalf of pdalgd using gmail.com> wrote:
> On 4 Jun 2018, at 10:45 , Christofer Bogaso <bogaso.christofer using gmail.com> wrote:
>
> Hi,
>
> I have an automatic data feed and I obtained a Date vector in the following
> format:
>
>> Date
> [1] "03 Jun 2018 10:01 am CT" "01 Jun 2018 22:04:25 pm CT"
>
> I now like to convert it to UTC time-zone
>
> Is there any easy way to convert them so, particularly since 1st element
> doesnt have any Second element whereas the 2nd element has.
..and it also mixes up am/pm notation and 24hr clock.
There are two basic approaches to the format inconsistency thing:
(A) preprocess using gsub() constructions
> gsub(" (..:..) ", " \\1:00 ", d.txt)
[1] "03 Jun 2018 10:01:00 am CT" "01 Jun 2018 22:04:25 pm CT"
(B) Try multiple formats
> d <- strptime(d.txt, format="%d %B %Y %H:%M:%S %p")
> d[is.na(d)] <- strptime(d.txt[is.na(d)], format="%d %B %Y %H:%M %p")
> d
[1] "2018-06-03 10:01:00 CEST" "2018-06-01 22:04:25 CEST"
I would likely go for (A) since you probably need to do something gsub-ish to get the TZ thing in place.
-pd
>
> Thanks for any pointer.
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes using cbs.dk Priv: PDalgd using gmail.com