[R] How to average time series data around regular intervals
jim holtman
jholtman at gmail.com
Mon Aug 27 19:40:25 CEST 2012
Try this:
> x <- read.table(text = "2012-07-22 12:12:00, 21
+ 2012-07-22 12:15:00, 22
+ 2012-07-22 12:18:00, 24
+ 2012-07-22 12:39:00, 21
+ 2012-07-22 12:45:00, 25
+ 2012-07-22 12:49:00, 26
+ 2012-07-22 12:53:00, 20
+ 2012-07-22 13:00:00, 18
+ 2012-07-22 13:06:00, 22", colClasses = c("POSIXct", "integer"), sep = ',')
> # truncate the earliest timestamp down to the hour
> tMin <- trunc(min(x$V1), units = 'hour')
> # back off 7.5 minutes so each 15-minute bin is centered on a quarter hour
> tMin <- tMin - (7.5 * 60)
> # create the sequence of break points for 'cut'
> cSeq <- seq(tMin, max(x$V1) + (7.5 * 60), by = '15 min')
> # assign each timestamp to its 15-minute bin
> cCut <- cut(x$V1, cSeq)
> # compute the mean of each bin (empty bins give NA)
> tapply(x$V2, cCut, mean)
2012-07-22 11:52:30 2012-07-22 12:07:30 2012-07-22 12:22:30 2012-07-22 12:37:30
                 NA            22.33333                  NA            24.00000
2012-07-22 12:52:30
           20.00000
>
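Note that the labels printed by 'cut' are the left edge of each bin (12:07:30, 12:22:30, ...), not the quarter-hour marks in the requested output, and the first bin (11:52:30) is only padding. Below is a minimal sketch of one way to label each bin by its center instead. It is a variation on the approach above, not a definitive solution: it assumes the timestamps can be treated as UTC, and the names time, value, centers, breaks and result are illustrative only.

```r
# Same sample data as in the transcript above.
# strip.white = TRUE drops the blank after each comma.
x <- read.table(text = "2012-07-22 12:12:00, 21
2012-07-22 12:15:00, 22
2012-07-22 12:18:00, 24
2012-07-22 12:39:00, 21
2012-07-22 12:45:00, 25
2012-07-22 12:49:00, 26
2012-07-22 12:53:00, 20
2012-07-22 13:00:00, 18
2012-07-22 13:06:00, 22", sep = ",", strip.white = TRUE,
                col.names = c("time", "value"), stringsAsFactors = FALSE)

# Assumption: treat the timestamps as UTC to sidestep local DST effects.
x$time <- as.POSIXct(x$time, tz = "UTC")

width <- 15 * 60     # bin width in seconds
half  <- width / 2   # 7.5 minutes in seconds
secs  <- as.numeric(x$time)

# Nearest quarter-hour mark for the first and last observation.
first <- floor(min(secs) / width + 0.5) * width
last  <- floor(max(secs) / width + 0.5) * width

# Bin centers, and break points 7.5 minutes either side of them.
centers <- seq(first, last, by = width)
breaks  <- c(centers - half, last + half)

# Index of the bin each observation falls in; bins are [left, right).
bin <- cut(secs, breaks, right = FALSE, labels = FALSE)

# Keep every bin as a factor level so empty bins come back as NA.
means <- tapply(x$value, factor(bin, levels = seq_along(centers)), mean)

result <- data.frame(
  time = as.POSIXct(centers, origin = "1970-01-01", tz = "UTC"),
  mean = as.vector(means)
)
print(result)
```

With the sample data this should give four rows, for 12:15, 12:30, 12:45 and 13:00, with means of about 22.33, NA, 24 and 20. A timestamp that falls exactly on a break (for example 12:22:30) goes to the later bin because the intervals are closed on the left.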
On Mon, Aug 27, 2012 at 9:53 AM, Jason Gilmore <wj at wjgilmore.com> wrote:
>
> Hi,
>
> I'm pretty new to R and have run into a task which although I'm certain is
> within R's capabilities, falls outside of mine. :-) Consider the following
> data set:
>
> 2012-07-22 12:12:00, 21
> 2012-07-22 12:15:00, 22
> 2012-07-22 12:18:00, 24
> 2012-07-22 12:39:00, 21
> 2012-07-22 12:45:00, 25
> 2012-07-22 12:49:00, 26
> 2012-07-22 12:53:00, 20
> 2012-07-22 13:00:00, 18
> 2012-07-22 13:06:00, 22
>
> My task involves creating a data set which *averages* these values at a
> resolution of 15 minutes, meaning that I need to average the values falling
> within 7.5 minutes of a 15 minute increment. Therefore given the above data
> set I need to treat it as three groups:
>
> 2012-07-22 12:12:00, 21
> 2012-07-22 12:15:00, 22
> 2012-07-22 12:18:00, 24
>
> 2012-07-22 12:39:00, 21
> 2012-07-22 12:45:00, 25
> 2012-07-22 12:49:00, 26
>
> 2012-07-22 12:53:00, 20
> 2012-07-22 13:00:00, 18
> 2012-07-22 13:06:00, 22
>
> The end result should look like this:
>
> 2012-07-22 12:15:00, 22.33
> 2012-07-22 12:30:00, NA <- Because this 15 minute slot did not previously
> exist
> 2012-07-22 12:45:00, 24
> 2012-07-22 13:00:00, 20
>
> Any help much appreciated. I've been working on this for several hours with
> little success. I'm able to identify the missing (NA) value using zoo/xts
> but can't seem to sort out the averaging matter.
>
> Thanks so much!
> Jason
>
--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.