[R] Calculate daily means from 5-minute interval data

Rich Shepard r@hep@rd @end|ng |rom @pp|-eco@y@@com
Tue Aug 31 14:36:43 CEST 2021

On Tue, 31 Aug 2021, Richard O'Keefe wrote:

> By the time you get the data from the USGS, you are already far past the point
> where what the instruments can write is important.


The data are important because they show what happened during that period of
record. Don't physicians take a medical history from patients even though
the events it describes are long past?

> agency_cd site_no datetime tz_cd 71932_00060 71932_00060_cd
> 5s 15s 20d 6s 14n 10s
> (I do not know what the last line signifies.)

The numbers represent the space for each fixed-width field.
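
For anyone reading such a file directly, here is a minimal sketch of one way
to skip that format line. It assumes the file is tab-delimited (as the use of
read.delim suggests) and that comment lines start with "#". The sample rows,
the site number, and the object names rdb, lines, and r are made up for
illustration; they are not real gauge data.

```r
# Hypothetical sample in the layout quoted above (values are invented).
rdb <- c(
  "# hypothetical sample, not real gauge data",
  "agency_cd\tsite_no\tdatetime\ttz_cd\t71932_00060\t71932_00060_cd",
  "5s\t15s\t20d\t6s\t14n\t10s",
  "USGS\t00000000\t2020-08-30 00:00\tPDT\t1000\tA",
  "USGS\t00000000\t2020-08-30 00:05\tPDT\t1010\tA"
)

# Drop comment lines, then drop the format-descriptor line that
# follows the header (the second remaining line).
lines <- rdb[!startsWith(rdb, "#")]

# Keep site_no as character so leading zeros survive, and keep the
# original column names (they start with a digit).
r <- read.delim(text = paste(lines[-2], collapse = "\n"),
                colClasses = c(site_no = "character"),
                check.names = FALSE)
str(r)
```

With a real file, readLines() on the file path would take the place of the
inline rdb vector.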

> After using read.delim to read the file
> I note that the timestamps are in a single column, formatted like
> "2020-08-30 00:15", matching the pattern "%Y-%m-%d %H:%M".
> After reading the data into R and using
> r$datetime <- as.POSIXct(r$datetime, format="%Y-%m-%d %H:%M",
>                           tz=r$tz_cd)

And I use emacs to replace the space between columns with commas so the date
and the time are separate.
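
Either layout leads to the same daily means. As a rough sketch only, assuming
a single timestamp column and one time zone for the whole series: the data
frame r, the column name cfs, and the values below are hypothetical, and
"UTC" stands in for whatever zone the gauge actually reports. Note that
as.POSIXct() expects a single string for tz, not a column of values.

```r
# Hypothetical 5-minute readings covering two full days (invented values).
r <- data.frame(
  datetime = format(seq(as.POSIXct("2020-08-30 00:00", tz = "UTC"),
                        by = "5 min", length.out = 576),
                    "%Y-%m-%d %H:%M"),
  cfs = rep(c(100, 200), each = 288)
)

# Parse the timestamps; tz must be one time-zone string.
r$datetime <- as.POSIXct(r$datetime, format = "%Y-%m-%d %H:%M", tz = "UTC")

# Derive the calendar date in the same zone and average within each day.
r$date <- as.Date(r$datetime, tz = "UTC")
daily <- aggregate(cfs ~ date, data = r, FUN = mean)
daily
```

If the date is already in its own column, as in my files, the as.Date() step
is unnecessary and aggregate() can group on that column directly.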

> So for this data set, spanning one year, all the times are in the same time
> zone, observations are 15 minutes apart, not 5, and there are no missing
> data.  This was obviously the wrong data set.

As I provided when I first asked for suggestions:

The recorded values are 5 minutes apart.

That data set is immaterial for my project but perfect when one needs data
from that gauge station on the Rogue River.

> The flow is dominated by a series of "bursts" with a fast onset to a peak
> and a slow decay, coming in a range of sizes from quite small to rather
> large, separated by gaps of 4 to 45 days.

And when discharge is controlled by flows through a hydroelectric dam there
is a lot of variability. The pattern is important to fish as well as

> I'd be looking at
> - how do I *detect* these bursts? (detecting a peak isn't too hard,
>   but the peak is not the onset)
> - how do I *characterise* these bursts?
>   (and is the onset rate related to the peak size?)
> - what's left after taking the bursts out?
> - can I relate these bursts to something going on upstream?

Well, those questions could be appropriate depending on what questions you
need the data to answer.

Environmental data are quite different from experimental, economic,
financial, and public data (e.g., unemployment, housing costs).

There are always multiple ways to address an analytical need. Thank you for
your contributions.

Stay well,

