[R] (FULL) Need help to optimize a piece of code involving zoo objects
Sergey Goriatchev
sergeyg at gmail.com
Fri Jun 19 09:36:50 CEST 2009
(Sorry, sent the message before I finished it)
Hello, everyone
I have a long script that uses zoo objects. In this script I used
simple moving averages and these I can very efficiently calculate with
filter() functions.
Now, I have to use special "exponential" moving averages, and the only
way I could write the code was with a for-loop, which makes everything
extremely slow.
I don't know how to optimize the code, but I need to find a solution.
I hope someone can help me.
The special moving average is calculated in the following way:
EMA = ( K x ( C - P ) ) + P
where,
C = Current Value
P = Previous periods EMA (A SMA is used for the first period's calculation)
K = Exponential smoothing constant
K = 2 / ( 1 + Periods )
Below is the code with the for-loop.
-"temp" contains C
-Periods is variable "j" in the for loop (so K varies)
- I first produce a vector of simple equally weighted moving average,
and use the first non-NA value to initiate the second for-loop
x.Date <- as.Date("2003-02-01") + seq(1,1100) - 1
temp <- zoo(rnorm(1100, 0, 10)+100, x.Date)
start.time <- proc.time()
for(j in seq(5,100,by=5)){
#PRODUCE FAST MOVING AVERAGE
#Create equally weighted MA vector (we need only the first value)
smafast <- zoo(coredata(filter(coredata(temp[,1]), filter=rep(1/j,
j), sides=1)), order.by=time(temp))
#index of first non-NA value, which is the first SMA needed
#which(is.na(smafast))[length(which(is.na(smafast)))]+1
#Calculate decay factor K
#number of periods is j
K <- 2/(1+j)
#Calculate recursively the EMA for the fast index (starting with
second non-NA value)
for (k in
(which(is.na(smafast))[length(which(is.na(smafast)))]+2):length(smafast))
{
smafast[k] <-
coredata(smafast[k-1])+K*(coredata(temp[k,1])-coredata(smafast[k-1]))
}
#PRODUCE SLOW MOVING AVERAGE
#Create equally weighted MA vector (we need only the first value)
smaslow <- zoo(coredata(filter(coredata(temp[,1]),
filter=rep(1/(j*4), (j*4)), sides=1)), order.by=time(temp))
K <- 2/(1+j*4)
#Calculate EMA
for (k in
(which(is.na(smaslow))[length(which(is.na(smaslow)))]+2):length(smaslow))
{
smaslow[k] <-
coredata(smaslow[k-1])+K*(coredata(temp[k,1])-coredata(smaslow[k-1]))
}
#COMBINE DIFFERENCES OF FAST AND SLOW
temp <- merge(temp, ma=smafast-smaslow)
}
proc.time()-start.time
--------------------------------------------------------------------------------------
Could someone help me to optimize the two EMA for-loops within the
bigger for-loop?
I need to cut the execution time down by at least half.
Thank you in advance for your help!
Best,
Sergey
More information about the R-help
mailing list