[R] How to average time series data around regular intervals

jim holtman jholtman at gmail.com
Mon Aug 27 19:40:25 CEST 2012


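Try this. It builds 15-minute bins that start 7.5 minutes before each
quarter hour, so that each bin is centered on a quarter hour, and then
averages the values in each bin. Note that the labels in the output are
the left edge of each bin (12:07:30 is the bin centered on 12:15:00), and
that the first bin is NA because the sequence starts from the hour (12:00)
and no readings fall within 7.5 minutes of it.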


> x <- read.table(text = "2012-07-22 12:12:00, 21
+ 2012-07-22 12:15:00, 22
+ 2012-07-22 12:18:00, 24
+ 2012-07-22 12:39:00, 21
+ 2012-07-22 12:45:00, 25
+ 2012-07-22 12:49:00, 26
+ 2012-07-22 12:53:00, 20
+ 2012-07-22 13:00:00, 18
+ 2012-07-22 13:06:00, 22", colClasses = c("POSIXct", "integer"), sep = ',')
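> # truncate the earliest timestamp down to the hour
> tMin <- trunc(min(x$V1), units = 'hours')
> # back off 7.5 minutes so that each bin is centered on a quarter hour
> tMin <- tMin - (7.5 * 60)
> # create the sequence of bin edges for 'cut'
> cSeq <- seq(tMin, max(x$V1) + (7.5 * 60), by = '15 min')
> # assign each timestamp to its 15-minute bin
> cCut <- cut(x$V1, cSeq)
> # compute the mean of each bin (labels are the left edge of the bin)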
> tapply(x$V2, cCut, mean)
2012-07-22 11:52:30 2012-07-22 12:07:30 2012-07-22 12:22:30 2012-07-22 12:37:30
                 NA            22.33333                  NA            24.00000
2012-07-22 12:52:30
           20.00000
>
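
If you would rather have each row labelled with the center of its bin
(12:15:00 instead of 12:07:30) and without the empty leading bin, something
like the following sketch might do. It is the same cut/tapply idea; the
names times, values, half, breaks and result are made up for this example,
and tz = "UTC" is an assumption added only to keep the result reproducible.

```r
# Same readings as above, built directly as vectors.
# Assumption: timestamps are treated as UTC.
times <- as.POSIXct(c("2012-07-22 12:12:00", "2012-07-22 12:15:00",
                      "2012-07-22 12:18:00", "2012-07-22 12:39:00",
                      "2012-07-22 12:45:00", "2012-07-22 12:49:00",
                      "2012-07-22 12:53:00", "2012-07-22 13:00:00",
                      "2012-07-22 13:06:00"), tz = "UTC")
values <- c(21, 22, 24, 21, 25, 26, 20, 18, 22)

half <- 7.5 * 60  # half a bin width, in seconds

# bin edges: start 7.5 minutes before the hour of the first reading
start  <- as.POSIXct(trunc(min(times), units = "hours")) - half
breaks <- seq(start, max(times) + half, by = "15 min")

# mean per bin; empty bins come back as NA
means <- tapply(values, cut(times, breaks), mean)

# label each bin by its center (left edge + 7.5 minutes)
result <- data.frame(time = head(breaks, -1) + half,
                     mean = as.vector(means))

# drop empty bins before the first non-empty one, keep interior NAs
result <- result[cumsum(!is.na(result$mean)) > 0, ]
print(result)
```

Interior gaps such as the 12:30:00 slot stay in the result as NA; only the
empty bin in front of the first reading is dropped.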


On Mon, Aug 27, 2012 at 9:53 AM, Jason Gilmore <wj at wjgilmore.com> wrote:
>
> Hi,
>
> I'm pretty new to R and have run into a task which although I'm certain is
> within R's capabilities, falls outside of mine. :-) Consider the following
> data set:
>
> 2012-07-22 12:12:00, 21
> 2012-07-22 12:15:00, 22
> 2012-07-22 12:18:00, 24
> 2012-07-22 12:39:00, 21
> 2012-07-22 12:45:00, 25
> 2012-07-22 12:49:00, 26
> 2012-07-22 12:53:00, 20
> 2012-07-22 13:00:00, 18
> 2012-07-22 13:06:00, 22
>
> My task involves creating a data set which *averages* these values at a
> resolution of 15 minutes, meaning that I need to average the values falling
> within 7.5 minutes of a 15 minute increment. Therefore given the above data
> set I need to treat it as three groups:
>
> 2012-07-22 12:12:00, 21
> 2012-07-22 12:15:00, 22
> 2012-07-22 12:18:00, 24
>
> 2012-07-22 12:39:00, 21
> 2012-07-22 12:45:00, 25
> 2012-07-22 12:49:00, 26
>
> 2012-07-22 12:53:00, 20
> 2012-07-22 13:00:00, 18
> 2012-07-22 13:06:00, 22
>
> The end result should look like this:
>
> 2012-07-22 12:15:00, 22.33
> 2012-07-22 12:30:00, NA <- Because this 15 minute slot did not previously
> exist
> 2012-07-22 12:45:00, 24
> 2012-07-22 13:00:00, 20
>
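
Another way to get this layout, sketched here as an alternative to cut(),
is to round every timestamp to its nearest quarter hour and then average
per slot over a complete 15-minute grid, so that slots with no readings
show up as NA. The names times, values, step, slot, grid and result are
made up for this example, and tz = "UTC" is an assumption; the rounding
relies on quarter hours being multiples of 900 seconds since the epoch,
which holds for UTC.

```r
# Readings from the question. Assumption: timestamps are treated as UTC.
times <- as.POSIXct(c("2012-07-22 12:12:00", "2012-07-22 12:15:00",
                      "2012-07-22 12:18:00", "2012-07-22 12:39:00",
                      "2012-07-22 12:45:00", "2012-07-22 12:49:00",
                      "2012-07-22 12:53:00", "2012-07-22 13:00:00",
                      "2012-07-22 13:06:00"), tz = "UTC")
values <- c(21, 22, 24, 21, 25, 26, 20, 18, 22)

step <- 15 * 60  # slot width in seconds

# round to the nearest multiple of 15 minutes (ties go up, so a reading
# exactly 7.5 minutes past a slot belongs to the next slot)
secs <- floor((as.numeric(times) + step / 2) / step) * step
slot <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

# complete grid of slots between the first and last slot seen
grid <- seq(min(slot), max(slot), by = step)

# mean per slot; the factor levels force empty slots to appear as NA
means <- tapply(values,
                factor(as.numeric(slot), levels = as.numeric(grid)),
                mean)

result <- data.frame(time = grid, mean = as.vector(means))
print(result)
```
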
> Any help much appreciated. I've been working on this for several hours with
> little success. I'm able to identify the missing (NA) value using zoo/xts
> but can't seem to sort out the averaging matter.
>
> Thanks so much!
> Jason




--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



