[Rd] parallel performance inline code vs using function ?

Benoit Thieurmel bt at datak.fr
Mon Jul 27 15:16:32 CEST 2015


Hi,

I am trying to understand why, with the parallel package, the same code
seems to be slower when it runs inside a function. For example:

# data: a list of 150 data frames, 100000 rows each
don <- lapply(1:150, function(x) {
  data.frame(a = rnorm(100000), b = rnorm(100000))
})

# inline test
t0 <- Sys.time()

require(parallel)
cl <- makeCluster(4)
res <- parLapplyLB(cl, don, function(x){1})
stopCluster(cl)

Sys.time() - t0 # 3.5 sec, each thread up to 90 MB

# using function
parF <- function(data){

  require(parallel)
  cl <- makeCluster(4)

  result <- parLapply(cl, data, function(x){1})

  stopCluster(cl)

  # return the list; otherwise parF returns the value of stopCluster(cl)
  result
}

system.time(res2 <- parF(don)) # 9.5 sec, each thread up to 320 MB!


It seems that running the same code inside a function:

   - is about 3x slower (9.5 sec vs 3.5 sec)
   - loads more data into each thread (up to 320 MB vs 90 MB)
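
One thing that might help narrow this down (an assumption on my part, not
something measured above): the anonymous function passed to parLapply inside
parF is created in parF's call frame, and R serializes a closure together
with its enclosing environment, whereas the global environment is only
recorded by reference. The sketch below compares serialized sizes only; it
does not start a cluster, and the names big, make_nested, f_nested and
f_detached are hypothetical, chosen for illustration.

```r
# Illustrative sketch only: compares how many bytes serialize() needs for
# the same trivial function, depending on its enclosing environment.
big <- rnorm(1e6)  # about 8 MB of doubles

# Hypothetical helper: returns a closure whose enclosing environment is the
# call frame of make_nested(), which holds `data`.
make_nested <- function(data) {
  force(data)  # evaluate the argument, as parLapply(cl, data, ...) would
  function(x) 1
}
f_nested <- make_nested(big)

# Same function body, but with the global environment as its enclosure.
f_detached <- f_nested
environment(f_detached) <- globalenv()

size_nested   <- length(serialize(f_nested, NULL))
size_detached <- length(serialize(f_detached, NULL))

cat("nested closure:  ", size_nested, "bytes\n")
cat("detached closure:", size_detached, "bytes\n")
```

If the two sizes differ the way I expect, defining the worker function at
top level, or resetting its environment as above, would be one thing to try
inside parF.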

Thanks.
-- 


Benoit Thieurmel  +33 6 69 04 06 11 10 place de la Madeleine - 75008 Paris
