[R] Calculate daily means from 5-minute interval data

Rich Shepard r@hep@rd @end|ng |rom @pp|-eco@y@@com
Fri Sep 3 21:17:53 CEST 2021


On Thu, 2 Sep 2021, Jeff Newmiller wrote:

> Regardless of whether you use the lower-level split function, or the
> higher-level aggregate function, or the tidyverse group_by function, the
> key is learning how to create the column that is the same for all records
> corresponding to the time interval of interest.

Jeff,

I definitely agree with the above.
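
For concreteness, here is a minimal sketch of that grouping-column idea in base R. The data frame, the column names (sampdt, value, day), and the fixed UTC-8 zone are assumptions made up for illustration; they are not taken from the actual data set.

```r
# Hypothetical data: two days of 5-minute readings (288 per day).
readings <- data.frame(
  sampdt = seq(as.POSIXct("2021-06-19 00:00", tz = "Etc/GMT+8"),
               by = "5 min", length.out = 2 * 288),
  value  = rep(c(1, 3), each = 288)
)

# The grouping column: identical for every record within the same day.
readings$day <- as.Date(readings$sampdt, tz = "Etc/GMT+8")

# Higher-level approach: aggregate() over the grouping column.
daily <- aggregate(value ~ day, data = readings, FUN = mean)

# Lower-level approach: split() the values by day, then take each mean.
daily_split <- sapply(split(readings$value, readings$day), mean)
```

Both approaches should give one mean per calendar day; the only real work is building the day column.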

> If you convert the sampdate to POSIXct, the tz IS important, because most
> of us use local timezones that respect daylight savings time, and a naive
> conversion of standard time will run into trouble if R is assuming
> daylight savings time applies. The lubridate package gets around this by
> always assuming UTC and giving you a function to "fix" the timezone after
> the conversion. I prefer to always be specific about timezones, at least
> by using something like
>    Sys.setenv( TZ = "Etc/GMT+8" )
> which does not respect daylight savings.

I'm not following you here. All my projects have always been in a single
time zone and the data might be recorded on June 19th or November 4th but do
not depend on whether the time is PDT or PST. My hosts all set the hardware
clock to local time, not UTC.

As the location(s) at which data are collected remain fixed geographically, I
don't understand why daylight savings time, or non-daylight savings time, is
important.
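
One place where the distinction could show up even at a fixed site is the length of the day itself. The sketch below counts the 5-minute slots on the 2021 US spring-forward date in a zone that observes daylight savings and in the fixed-offset zone quoted above. The helper name count_slots and the choice of "America/Los_Angeles" as the local zone are assumptions for illustration.

```r
# Count the 5-minute slots between local midnight on 2021-03-14 and
# local midnight on 2021-03-15 in a given time zone.
count_slots <- function(tz) {
  start <- as.POSIXct("2021-03-14 00:00:00", tz = tz)
  end   <- as.POSIXct("2021-03-15 00:00:00", tz = tz)
  # seq() includes both endpoints, so subtract one to count intervals.
  length(seq(start, end, by = "5 min")) - 1L
}

slots_local <- count_slots("America/Los_Angeles")  # observes DST: 23-hour day
slots_fixed <- count_slots("Etc/GMT+8")            # fixed UTC-8: 24-hour day
```

Under these assumptions the DST-observing zone has 276 slots that day and the fixed-offset zone has 288, so a logger that stays on standard time records clock readings between 02:00 and 03:00 that do not exist in the DST-observing zone, and a conversion has to decide what to do with them.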

> Regarding using character data for identifying the month, in order to have
> clean plots of the data I prefer to use the trunc function but it returns
> a POSIXlt so I convert it to POSIXct:

I don't use character data for months, as far as I know. If a sample date
is, for example, 2021-09-03 then monthly summaries are based on '09', not
'September.'
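
The two ways of labelling a month can be compared side by side. This is only a sketch with made-up timestamps and variable names (stamps, month_start, month_chr), not the code from the quoted message.

```r
# Hypothetical timestamps in a fixed UTC-8 zone.
stamps <- as.POSIXct(c("2021-09-03 10:05:00",
                       "2021-09-30 23:55:00",
                       "2021-10-01 00:00:00"), tz = "Etc/GMT+8")

# trunc() returns POSIXlt, so convert back to POSIXct; each value is the
# first instant of its month and remains a date-time for plotting.
month_start <- as.POSIXct(trunc(stamps, units = "months"))

# format() gives the two-digit month as a character string, e.g. "09".
month_chr <- format(stamps, "%m")
```

Note that format() returns character strings, so a label such as '09' is character data as far as R is concerned, whereas month_start keeps the year and stays on a continuous time axis.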

I've always valued your inputs to help me understand what I don't. In this
case I'm really lost in understanding your position.

Have a good Labor Day weekend,

Rich


