[R] Parallel code using parLapply

Gabor Grothendieck ggrothendieck at gmail.com
Sat Dec 22 06:02:30 CET 2012


On Fri, Dec 21, 2012 at 10:42 AM, Chris Hergarten <chegaga at yahoo.com> wrote:
> Dear R-users
>
> I was running into problems with my R code trying to run clh sampling (clhs package) in parallel mode (=on various data sets simultaneously).
>
> Here is the code (which I developed with some help:)):
> ******************************************
> library("clhs")
> library("snow")
> a <- as.data.frame(replicate(1000, rnorm(20)))
> b <- as.data.frame(replicate(1000, rnorm(20)))
> c <- as.data.frame(replicate(1000, rnorm(20)))
> d <- as.data.frame(replicate(1000, rnorm(20)))
> abcd <- list(a, b, c, d)
> cl <- makeCluster(4)
> results <- parLapply(cl,
>    X = abcd,
>    FUN = function(i) {
>      clhs(x = i, size = round(nrow(i) / 5), iter = 2000, simple = FALSE)
>    },
> )
> stopCluster(cl)
> ******************************************
>
> Before running the last line, R is throwing an error: "Error in length(x) : 'x' is missing". Any ideas what I am doing wrong and how to improve?
>

Loading clhs on the primary does not automatically load it on the workers, so clhs() cannot be found there.  After makeCluster() and before the parLapply() call, try:

clusterEvalQ(cl, library(clhs))
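
For completeness, here is a sketch of how the whole script might look with that call in place. I have not run it here, and it makes a few assumptions beyond the fix above. The four data frames are built with replicate() purely for brevity. The data and the function are passed to parLapply positionally, because as far as I recall snow's parLapply names its arguments x and fun (lowercase) rather than X and FUN, which would also account for the "'x' is missing" message; please check ?parLapply in your installed version. The trailing comma after the function is dropped as well.

```r
library(snow)
library(clhs)

# Four data frames of 20 rows x 1000 columns, as in the original post
# (built in one step here; an assumption made only for brevity).
abcd <- replicate(4, as.data.frame(replicate(1000, rnorm(20))),
                  simplify = FALSE)

cl <- makeCluster(4)

# Load clhs on every worker; loading it on the primary is not enough.
invisible(clusterEvalQ(cl, library(clhs)))

# Arguments passed by position (assumed snow signature: cl, x, fun, ...).
results <- parLapply(cl, abcd, function(d) {
  clhs(x = d, size = round(nrow(d) / 5), iter = 2000, simple = FALSE)
})

stopCluster(cl)
```

results should then be a list with one clhs result per data frame, in the same order as abcd.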

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



