[R] Trimming time series to only include complete years

Morway, Eric emorway at usgs.gov
Fri May 27 20:04:17 CEST 2016


I am bulk processing streamflow data from an online database and want to
trim each time series so that daily data belonging to incomplete "water
years" (a water year runs from October 1st to the following September
30th) are dropped from the beginning and end of the series.

For a small reproducible example, the time series below starts on
2010-01-01 and ends on 2011-11-05.  The data between 2010-01-01 and
2010-09-30, and also between 2011-10-01 and 2011-11-05, therefore belong
to water years that are not complete.  With the real data, the initial
date of collection is arbitrary; it could be 1901 or 1938, etc.  Because
I'm cycling through potentially thousands of records, I need help
designing a function that is efficient.

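# Daily series with random values standing in for streamflow (Q)
dat <- data.frame(Date = seq(as.Date("2010-01-01"), as.Date("2011-11-05"),
                             by = "day"))
dat$Q <- rnorm(nrow(dat))

# Water year: Jan-Sep keep the calendar year, Oct-Dec roll to the next one
dat$wyr <- as.numeric(format(dat$Date, "%Y"))
is.nxt <- as.numeric(format(dat$Date, "%m")) %in% 1:9
dat$wyr[!is.nxt] <- dat$wyr[!is.nxt] + 1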


function(dat) {
   ...
   # returns a subset of dat such that
   #   dat$Date > xxxx-09-30 & dat$Date < yyyy-10-01
   ...
}

where the years between xxxx and yyyy are "complete" (no missing days).  In
the example above, the returned dat would extend from 2010-10-01 to
2011-09-30.
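
To make the requested behaviour concrete, here is one possible sketch of such a function, not a definitive answer. It counts the days present in each water year and keeps only the years whose count matches the expected 365 or 366. The name trim_to_complete_wyrs is hypothetical, and the code assumes one row per day with no duplicated dates. Note that it also drops any incomplete water year in the middle of a record, which goes slightly beyond trimming the two ends.

```r
# Sketch only: trim_to_complete_wyrs() is a hypothetical name, and the code
# assumes one row per day with no duplicated dates.
trim_to_complete_wyrs <- function(dat) {
  lt <- as.POSIXlt(dat$Date)
  # A water year is labelled by the calendar year in which it ends, so
  # Oct-Dec (mon is 0-based, hence >= 9) belong to the following year.
  wyr <- lt$year + 1900L + (lt$mon >= 9L)

  n_obs <- table(wyr)                    # days present per water year
  yrs   <- as.integer(names(n_obs))
  # Days a full water year should hold: 365, or 366 if it spans a Feb 29
  n_exp <- as.integer(as.Date(paste0(yrs, "-10-01")) -
                      as.Date(paste0(yrs - 1L, "-10-01")))

  complete <- yrs[as.integer(n_obs) == n_exp]
  dat[wyr %in% complete, , drop = FALSE]
}

# Usage with the example series from this post
dat <- data.frame(Date = seq(as.Date("2010-01-01"), as.Date("2011-11-05"),
                             by = "day"))
dat$Q <- rnorm(nrow(dat))

trimmed <- trim_to_complete_wyrs(dat)
range(trimmed$Date)   # 2010-10-01 to 2011-09-30
nrow(trimmed)         # 365
```

Because the work is a single table() call plus vectorised comparisons, the same function could be applied to each record in turn, for example with lapply() over a list of data frames.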

Any offered guidance is very much appreciated.
