[R] Accelerating the calculation of the moving average

Gabor Grothendieck ggrothendieck at gmail.com
Tue Mar 22 16:19:00 CET 2011


On Tue, Mar 22, 2011 at 11:05 AM, Tonja Krueger <tonja.krueger at web.de> wrote:
>
> Dear List,
> I have a data frame with approximately 500000 rows that looks like this:
>
>  Date    time    value
> 19.07.1956          12:00:00               4.84
> 19.07.1956          13:00:00               4.85
> 19.07.1956          14:00:00               4.89
> 19.07.1956          15:00:00               4.94
> 19.07.1956          16:00:00               4.99
> 19.07.1956          17:00:00               5.01
> 19.07.1956          18:00:00               5.04
> 19.07.1956          19:00:00               5.04
> 19.07.1956          20:00:00               5.04
> 19.07.1956          21:00:00               5.02
> 19.07.1956          22:00:00               5.01
> 19.07.1956          23:00:00               5.00
> 20.07.1956          00:00:00               4.99
> 20.07.1956          01:00:00               4.99
> 20.07.1956          02:00:00               5.00
> 20.07.1956          03:00:00               5.03
> 20.07.1956          04:00:00               5.07
> 20.07.1956          05:00:00               5.10
> 20.07.1956          06:00:00               5.14
> 20.07.1956          07:00:00               5.14
> 20.07.1956          08:00:00               5.11
> 20.07.1956          09:00:00               5.08
> 20.07.1956          10:00:00               5.03
> 20.07.1956          11:00:00               4.98
> 20.07.1956          12:00:00               4.94
> 20.07.1956          13:00:00               4.93
>
> I want to calculate the moving average of the right column.
> I tried:
>
> dat$index<-1:length(dat$Zeit)
> qs<- 43800
> erg<-c()
> for (y in min(dat$index):max(dat$index)){
> m<- mean(dat[(dat$index>=y)&(dat$index<=y+qs+1),3])
> erg<-c(erg,m)
> }
>
> It works, but it takes ages. Is there a faster way to compute the moving average?
>
> Thank you,
> Tonja Krueger

There are rolling mean or sum functions written in C in the caTools,
xts and TTR packages (and possibly other packages as well).
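
As a rough sketch of the C-based route, the following uses runmean from caTools. It assumes caTools is installed; the vector x is synthetic stand-in data and the window width k is an assumed value near the qs used in the question, chosen odd so that the default centred window sits on a row.

```r
# Assumes the caTools package is installed: install.packages("caTools")
library(caTools)

set.seed(42)
x <- cumsum(rnorm(5e5))   # synthetic stand-in for the value column
k <- 43801                # window width in rows (assumed; odd for a centred window)

# runmean() is implemented in C. endrule = "NA" marks positions that
# lack a full window as NA instead of averaging a shorter window.
ma <- runmean(x, k, endrule = "NA")
```

With the default centred alignment, ma[(k + 1) / 2] is the mean of x[1:k]. The align argument of runmean can shift the window if a forward-looking average like the one in the question is wanted.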

There are also faster ways to do it even in pure R such as the
rollmean function in zoo (although that would not be expected to be as
fast as the C implementations).
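
For the pure R route, one possible sketch uses cumulative sums, so each window mean costs one subtraction rather than a pass over the whole data frame. The helper name roll_mean_left is hypothetical and not part of any package. The loop in the question averages rows y through y+qs+1, which is qs + 2 values, and shortens the window near the end of the data; this sketch returns NA there instead. The zoo call is only run if zoo is installed.

```r
# Rolling mean over forward-looking windows of width k, base R only.
# roll_mean_left is a hypothetical helper name, not part of any package.
roll_mean_left <- function(x, k) {
  n <- length(x)
  stopifnot(k >= 1, k <= n)
  cs <- c(0, cumsum(x))               # cs[i + 1] holds sum(x[1:i])
  out <- rep(NA_real_, n)             # incomplete windows stay NA
  starts <- seq_len(n - k + 1)        # rows where a full window fits
  out[starts] <- (cs[starts + k] - cs[starts]) / k
  out
}

# First few values from the posted data
x <- c(4.84, 4.85, 4.89, 4.94, 4.99, 5.01, 5.04, 5.04)
ma <- roll_mean_left(x, k = 3)

# The same windows via zoo, if that package is available
if (requireNamespace("zoo", quietly = TRUE)) {
  ma_zoo <- zoo::rollmean(x, k = 3, fill = NA, align = "left")
}
```

For the data in the question this would be called as roll_mean_left(dat$value, qs + 2), assuming the third column is named value. Any NA in the input propagates through cumsum to all later windows, so missing values would need handling first.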

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

