[R] Trimming time series to only include complete years
Jeff Newmiller
jdnewmil at dcn.davis.ca.us
Tue May 31 00:15:54 CEST 2016
Sorry, I put too many bugs (opportunities for excellence!) into my first
pass at this to leave it alone :-(
isPartialWaterYear2 <- function( d ) {
  dtl <- as.POSIXlt( d )
  # "mon" is 0-11, so 9 is October and 8 is September
  wy1 <- cumsum( ( 9 == dtl$mon ) & ( 1 == dtl$mday ) )
  # any 0 in wy1 corresponds to first partial water year
  result <- 0 == wy1
  # if last day is not Sep 30, mark last water year as partial
  if ( 8 != dtl$mon[ length( d ) ]
       || 30 != dtl$mday[ length( d ) ] ) {
    result[ wy1[ length( d ) ] == wy1 ] <- TRUE
  }
  result
}
dat2 <- dat[ !isPartialWaterYear2( dat$Date ), ]
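A quick check against the example data from the original question may help confirm the behaviour. This is only a sketch: the function is repeated so the block runs on its own, and the variable n is shorthand that is not in the code above.

```r
# Function repeated here so the block is self-contained
isPartialWaterYear2 <- function( d ) {
  dtl <- as.POSIXlt( d )
  # count Oct 1 occurrences seen so far ("mon" is 0-11, so 9 is October)
  wy1 <- cumsum( ( 9 == dtl$mon ) & ( 1 == dtl$mday ) )
  result <- 0 == wy1                 # rows before the first Oct 1
  n <- length( d )                   # shorthand, not in the original
  if ( 8 != dtl$mon[ n ] || 30 != dtl$mday[ n ] ) {
    result[ wy1[ n ] == wy1 ] <- TRUE  # trailing partial water year
  }
  result
}

# Example data from the original question
dat <- data.frame( Date = seq( as.Date( "2010-01-01" )
                             , as.Date( "2011-11-05" )
                             , by = "day" ) )
dat$Q <- rnorm( nrow( dat ) )

dat2 <- dat[ !isPartialWaterYear2( dat$Date ), ]
range( dat2$Date )  # expected: 2010-10-01 to 2011-09-30
nrow( dat2 )        # expected: 365
```

The retained rows should be exactly the one complete water year that the original question asked for.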
On Sat, 28 May 2016, Jeff Newmiller wrote:
> # read about POSIXlt at ?DateTimeClasses
> # note that the "mon" element is 0-11
> isPartialWaterYear <- function( d ) {
> dtl <- as.POSIXlt( dat$Date )
> wy1 <- cumsum( ( 9 == dtl$mon ) & ( 1 == dtl$mday ) )
> ( 0 == wy1 # first partial year
> | ( 8 != dtl$mon[ nrow( dat ) ] # end partial year
> & 30 != dtl$mday[ nrow( dat ) ]
> ) & wy1[ nrow( dat ) ] == wy1
> )
> }
>
> dat2 <- dat[ !isPartialWaterYear( dat$Date ), ]
>
> The above assumes that, as you said, the data are continuous at one-day
> intervals, such that the only partial years will occur at the beginning and
> end. The "diff" function could be used to identify irregular data within the
> data interval if needed.
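A minimal sketch of such a diff() check, assuming Date-class input sorted in increasing order. The helper name hasIrregularSteps is hypothetical and does not appear elsewhere in this thread.

```r
# Hypothetical helper: TRUE if any consecutive dates are not exactly 1 day apart
hasIrregularSteps <- function( d ) {
  steps <- diff( as.numeric( as.Date( d ) ) )  # differences in days
  any( steps != 1 )
}

full <- seq( as.Date( "2010-01-01" ), as.Date( "2010-01-10" ), by = "day" )
hasIrregularSteps( full )        # FALSE: continuous daily series
hasIrregularSteps( full[ -5 ] )  # TRUE: one day removed

# position of the last record before each gap
gapAfter <- which( diff( as.numeric( full[ -5 ] ) ) != 1 )
gapAfter                         # 4
```

A series that fails this check would need its interior gaps handled before the partial-year logic can be trusted.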
>
> On Fri, 27 May 2016, Morway, Eric wrote:
>
>> In bulk processing streamflow data available from an online database, I'm
>> wanting to trim the beginning and end of the time series so that daily data
>> associated with incomplete "water years" (defined as extending from Oct 1st
>> to the following September 30th) is trimmed off the beginning and end of
>> the series.
>>
>> For a small reproducible example, the time series below starts on
>> 2010-01-01 and ends on 2011-11-05. So the data between 2010-01-01 and
>> 2010-09-30 and also between 2011-10-01 and 2011-11-05 is not associated
>> with a complete set of data for their respective water years. With the
>> real data, the initial date of collection is arbitrary, could be 1901 or
>> 1938, etc. Because I'm cycling through potentially thousands of records, I
>> need help in designing a function that is efficient.
>>
>> dat <-
>> data.frame(Date=seq(as.Date("2010-01-01"),as.Date("2011-11-05"),by="day"))
>> dat$Q <- rnorm(nrow(dat))
>>
>> dat$wyr <- as.numeric(format(dat$Date,"%Y"))
>> is.nxt <- as.numeric(format(dat$Date,"%m")) %in% 1:9
>> dat$wyr[!is.nxt] <- dat$wyr[!is.nxt] + 1
>>
>>
>> function(dat) {
>> ...
>> returns a subset of dat such that dat$Date > xxxx-09-30 & dat$Date <
>> yyyy-10-01
>> ...
>> }
>>
>> where the years between xxxx-yyyy are "complete" (no missing days). In the
>> example above, the returned dat would extend from 2010-10-01 to 2011-09-30
>>
>> Any offered guidance is very much appreciated.
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> ---------------------------------------------------------------------------
> Jeff Newmiller The ..... ..... Go Live...
> DCN:<jdnewmil at dcn.davis.ca.us> Basics: ##.#. ##.#. Live Go...
> Live: OO#.. Dead: OO#.. Playing
> Research Engineer (Solar/Batteries O.O#. #.O#. with
> /Software/Embedded Controllers) .OO#. .OO#. rocks...1k