[R] Changing time intervals in data set

Chris Evans chr|@ho|d @end|ng |rom p@yctc@org
Thu Dec 16 08:06:03 CET 2021


What you said earlier was:

> >> The data.frame/tibble has columns for year, month, day, hour, minute, and
> >> datetime.

> As well as a site_nbr.

> What I asked is,

> >> Would difftime() allow me to find the dates when the changes occurred?

For me the next step, in tidyverse pseudocode, might be something like:

library(dplyr)

tibData %>%
   arrange(site_nbr, datetime) %>% # just in case things are not ordered nicely
   group_by(site_nbr) %>% # as you want to get changes within site I think
   mutate(gapTime = difftime(datetime, lag(datetime), units = "mins")) %>% # gaps, in fixed units
   summarise(nGaps = n_distinct(gapTime, na.rm = TRUE)) # distinct gap lengths per site

(untested, may be flawed but it conveys the ideas)
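
To get at the dates themselves, one option is to compare each gap with the
one before it and keep the rows where they differ.  What follows is only a
sketch on made-up data: the toy tibData, the column names gapMins and
changed, and the result tibChanges are my own illustrative assumptions, not
your data.  It uses difftime() with fixed units so that gaps are comparable.

```r
library(dplyr)

# Hypothetical toy data: one site, 10-minute readings switching to 15-minute
tibData <- data.frame(
  site_nbr = "A",
  datetime = as.POSIXct("2021-01-01 00:00:00", tz = "UTC") +
    60 * c(0, 10, 20, 30, 45, 60, 75)
)

tibChanges <- tibData %>%
  arrange(site_nbr, datetime) %>%
  group_by(site_nbr) %>%
  mutate(
    # gap to the previous reading, always in minutes
    gapMins = as.numeric(difftime(datetime, lag(datetime), units = "mins")),
    # TRUE where this gap differs from the previous gap within the site
    changed = !is.na(lag(gapMins)) & gapMins != lag(gapMins)
  ) %>%
  ungroup() %>%
  filter(changed)

tibChanges
```

Each remaining row is the first reading taken after a change of gap, so its
datetime marks where the new interval shows up.  Equipment dropouts will
appear here too; they are not separated from genuine interval changes.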

From what you are saying, that will get you the number of distinct time gaps
per site.  That will help you work out how many are simple failures of sensors
etc. (would they come up as multiples of that site's then usual interval,
or might they be more complex?)  In the light of that you can start on the
somewhat more challenging issue of disentangling those from longer-lasting
switches in a site's gapTime value.  I am sure I can offer some
thoughts on that in the light of what you find, but the best solutions will
depend on the number of sites and on what those distributions of changes
within site look like.
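
One possible way to start on that disentangling, sketched here for a single
site with base R's rle(), is to collapse the gaps into runs and flag a
one-off gap that is a whole multiple of the preceding gap as a possible
dropout.  The gap values, the names runTable and isDropout, and the rule
itself (run length of 1, integer multiple) are assumptions of mine for
illustration, not something established from the USGS data.

```r
# Hypothetical gap sequence (minutes) for one site: 10-minute readings with a
# single missed reading (gap of 20), then a lasting switch to 15 minutes.
gapMins <- c(10, 10, 10, 20, 10, 10, 15, 15, 15, 15)

# Collapse consecutive equal gaps into runs
runs <- rle(gapMins)
runTable <- data.frame(
  gapMins    = runs$values,
  nGaps      = runs$lengths,
  startIndex = cumsum(c(1, head(runs$lengths, -1)))  # position of run's first gap
)

# Assumed rule: a run of length 1 whose gap is an integer multiple (> 1) of
# the previous run's gap looks like missed readings, not a new interval
prevGap   <- c(NA, head(runs$values, -1))
ratio     <- runs$values / prevGap
isDropout <- runs$lengths == 1 & !is.na(ratio) & ratio > 1 & ratio == round(ratio)

runTable$kind <- ifelse(isDropout, "possible dropout", "interval in use")
runTable
```

A rule this crude will misclassify some cases, for example a genuine switch
from 5 to 15 minutes that lasts for only one reading, so the flagged rows are
candidates to inspect rather than a verdict.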

Disclaimer: I am not a professional statistician nor a professional R 
coder though I do spend much of each week hacking up R code that works
and supports publications.  Others here are professional statisticians
_and_ professional R coders.

Very best and seasonal greetings to all,

Chris


----- Original Message -----
> From: "Rich Shepard" <rshepard using appl-ecosys.com>
> To: "r-help mailing list" <r-help using r-project.org>
> Sent: Wednesday, 15 December, 2021 23:42:42
> Subject: Re: [R] Changing time intervals in data set

> On Thu, 16 Dec 2021, Jim Lemon wrote:
> 
>> From what you sent, it seems like you want to find where the change in
>> _measurement interval_ occurred. That looks to me as though it is the
>> first datetime in each row. In the first row, there is a week gap between
>> the ten and fifteen minute intervals. This may indicate that no
>> measurements were taken or perhaps they were lost.
> 
> Jim,
> 
> Yes, there are times when the equipment fails, but not all changes in
> measurement intervals have a time gap other than a few minutes.
> 
> Normally I work with much smaller data sets so if there are interval changes
> they've not appeared in the data I've used.
> 
> I will learn from the USGS why there are so many measurement interval
> changes.
> 
> Because these data are so different from what I've seen in the past I want
> to explore whether (or how) they affect discharge variability calculations.
> 
> Regards,
> 
> Rich


-- 
Chris Evans (he/him) <chris using psyctc.org> 
Visiting Professor, UDLA, Quito, Ecuador & Honorary Professor, University of Roehampton, London, UK.
Work web site: https://www.psyctc.org/psyctc/ 
CORE site:     https://www.coresystemtrust.org.uk/
Personal site: https://www.psyctc.org/pelerinage2016/
OMbook:        https://ombook.psyctc.org/book/


