[R] Calculating rolling mean by group

Sam Albers tonightsthenight at gmail.com
Tue Jan 10 20:27:59 CET 2012


Thanks for getting me on the right path, Gabor! I have one outstanding
issue, though.

On Mon, Jan 9, 2012 at 4:21 PM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
> On Mon, Jan 9, 2012 at 6:39 PM, Sam Albers <tonightsthenight at gmail.com> wrote:
>> Hello all,
>>
>> I am trying to determine how to calculate rolling means in R using a
>> grouping variable. Say I have a dataframe like so:
>>
>> dat1 <- data.frame(x = runif(2190, 0, 125), year=rep(1995:2000,
>> each=365), jday=1:365, site="here")
>> dat2 <- data.frame(x = runif(2190, 0, 200), year=rep(1995:2000,
>> each=365), jday=1:365, site="there")
>> dat <- rbind(dat1,dat2)
>>
>> ## What I would like to do is calculate a rolling 7 day mean separately
>> ## for each site. I have looked at both rollmean() in the zoo package
>> ## and running.mean() in the igraph package, but neither seems to have
>> ## led me to calculating a rolling mean by group. My first thought was
>> ## to use the plyr package, but I am confused by this output:
>>
>> library(plyr)
>> library(zoo)
>>
>> ddply(dat, c("site"), function(df) return(c(roll=rollmean(df$x, 7))))
>>
>> ## Can anyone recommend a better way to do this or shed some light on
>> ## this output?
>>
>
> Using dat in the question, try this:
>
> library(zoo)
> z <- read.zoo(dat, index = 2:3, split = 4, format = "%Y %j")
> zz <- rollmean(z, 7)
>
> The result, zz, is a multivariate zoo series with one column per group.

Using the zoo approach works well, except that a wrinkle in my dataset
not reflected in the sample data caused some problems. I am actually
dealing with a situation where there is an unequal number of
observations in each group, as in the data set below:

library(zoo)

dat1 <- data.frame(x = runif(2190, 0, 125), year = rep(1995:2000, each = 365),
                   jday = 1:365, site = "here")
dat2 <- data.frame(x = runif(4380, 0, 200), year = rep(1989:2000, each = 365),
                   jday = 1:365, site = "there")
dat <- rbind(dat1,dat2)

## When I use read.zoo everything is read in fine
z <- read.zoo(dat, index = 2:3, split = 4, format = "%Y %j")

## But when I use rollmean to get a 7 day average for both the 'here'
## and 'there' columns, only the 'there' column's 7 day average is
## calculated
zz <- rollmean(z, 7)

Any thoughts on how I can then calculate a rolling mean on groups
where there is an unequal number of observations in each group?
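
For what it is worth, my guess is that the 'here' column is padded with
NA for 1989-1994 once the data are split, and the rollmean() help page
says its default method does not handle inputs that contain NAs. As a
stopgap, here is a minimal base-R sketch (no zoo) that keeps the data in
long form. It assumes the rows are already in date order within each
site and that there are no gaps between days. roll7() is a made-up
helper name and set.seed() is only there to make the toy data
reproducible; neither comes from the earlier posts.

```r
## Sketch only: a base-R alternative that computes the rolling mean per group.
set.seed(1)  # assumption: fixed seed so the toy data are reproducible

dat1 <- data.frame(x = runif(2190, 0, 125), year = rep(1995:2000, each = 365),
                   jday = 1:365, site = "here")
dat2 <- data.frame(x = runif(4380, 0, 200), year = rep(1989:2000, each = 365),
                   jday = 1:365, site = "there")
dat <- rbind(dat1, dat2)

## Hypothetical helper: centred 7-point moving average.
## The first 3 and last 3 values of each series come back as NA.
roll7 <- function(v) as.numeric(stats::filter(v, rep(1 / 7, 7), sides = 2))

## ave() applies roll7() within each site and returns a vector in the
## original row order, so groups of different lengths are not a problem.
dat$roll <- ave(dat$x, dat$site, FUN = roll7)

## Count the NA values per site; only the window edges should be missing.
tapply(is.na(dat$roll), dat$site, sum)
```

This gives the same centred window as rollmean(x, 7), except that the
edges are kept as NA rather than dropped. I would still prefer to stay
within zoo if there is a way to do it there.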

Thanks for the previous post and in advance.

Sam

>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com


