[R] is parallel computing possible for 'rollapplyr' job?

Gabor Grothendieck ggrothendieck at gmail.com
Thu Apr 5 16:00:26 CEST 2012


On Thu, Apr 5, 2012 at 9:18 AM, Pam <fkiraz11 at yahoo.com> wrote:
>
> Hi,
>
> The code below does exactly what I want in sequential mode. But, it is slow and I want to run it in parallel mode. I examined some windows version packages (parallel, snow, snowfall,..) but could not solve my specific problem. As far as I understood, either I have to write a new function like sfRollapplyr or I have to change my code in a way that it utilizes lapply, or sapply instead of 'rollapplyr' first then use sfInit, sfExport, and sfLapply,.. for parallel computing. I could not perform either so please help me :)
>
> ##
> nc<-313
> rs<-500000
> ema<-10
> h<-4
> gomin1sd<-function (x,rho)
> {
> getOutliers(as.vector(x),rho=c(1,1))$limit[1]
> }
> dim(dt_l1_inp)
> [1] 500000 312
> dt_l1_min1<-matrix(nrow=rs, ncol=nc-1-(ema*h))
> for (i in 1:rs)
> {
> dt_l1_min1[i,]<-rollapplyr(dt_l1_inp[i,], FUN=gomin1sd, width=ema*h+1)
> }

Since rollapply, by default, applies the rolling calculation to each
column, we can remove the loop like this (untested; note that the
result is the transpose of dt_l1_min1):

m <- t(dt_l1_inp)
w <- ema*h+1
rollapplyr(m,  w, gomin1sd)

and that might also give you a small speedup.
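
As a rough check of that equivalence, here is a small sketch on toy
data. The toy matrix x, the width of 4, and the use of min as a
stand-in for gomin1sd are all assumptions for illustration, not the
original data or function.

```r
library(zoo)

set.seed(1)
x <- matrix(rnorm(3 * 10), nrow = 3)   # toy stand-in for dt_l1_inp (3 rows)
w <- 4                                 # toy window width
f <- function(v) min(v)                # hypothetical stand-in for gomin1sd

# Row-by-row loop, as in the original post
by_loop <- matrix(nrow = nrow(x), ncol = ncol(x) - w + 1)
for (i in seq_len(nrow(x))) {
  by_loop[i, ] <- rollapplyr(x[i, ], w, f)
}

# One call on the transposed matrix, then transposed back
by_col <- t(rollapplyr(t(x), w, f))

all.equal(by_loop, by_col, check.attributes = FALSE)
```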

To take advantage of multiple processors we can run

rollapplyr(m[, seq(k)], w, gomin1sd) on the first processor,
rollapplyr(m[, k+seq(k)], w, gomin1sd) on the second processor
and so on

for suitably chosen k.
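
One possible way to sketch this chunking with the parallel package,
whose socket clusters also work on Windows. This is untested against
the original data: the toy matrix m, the worker count, and min as a
stand-in for gomin1sd are assumptions, and splitIndices is used to pick
the column chunks instead of a hand-chosen k. For the real function,
each worker would also need to load the package that provides
getOutliers.

```r
library(zoo)
library(parallel)

set.seed(1)
m <- matrix(rnorm(20 * 8), nrow = 20)  # toy data: 20 time points x 8 series
w <- 5                                 # toy window width
f <- function(v) min(v)                # hypothetical stand-in for gomin1sd

n_workers <- 2                         # assumed number of processors
idx <- splitIndices(ncol(m), n_workers)                  # column indices per chunk
blocks <- lapply(idx, function(j) m[, j, drop = FALSE])  # one sub-matrix per worker

cl <- makeCluster(n_workers)           # socket cluster
clusterEvalQ(cl, library(zoo))         # each worker needs zoo attached
pieces <- parLapply(cl, blocks,
                    function(b, w, f) rollapplyr(b, w, f),
                    w = w, f = f)
stopCluster(cl)

res <- do.call(cbind, pieces)          # reassemble chunks in column order
```

Sending only each worker's block, rather than exporting all of m to
every worker, keeps the data transfer down, which is likely to matter
with 500000 columns.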

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


