[R] problem with the use of parallel foreach
Vivek Sutradhara
viveksutra at gmail.com
Tue Dec 22 13:41:01 CET 2015
Hi,
I am having a problem with the foreach package. Strangely, my code works
when I use the %do% operator but not when I use %dopar%.
Let me explain. I am new to the parallel and foreach packages. My data are
in very large files, each holding a data.table. I have saved them as rds
files to take advantage of the compression.
I will try to give a reproducible example as follows:
setwd('C:/Rtrials/parallelTrials')
library(parallel)
#no_ofCores<-detectCores()
library(doSNOW)
cl <- makeCluster(2, type="SOCK")
registerDoSNOW(cl)
library(data.table)
# example data
mt<-data.table(mtcars)
# split mtcars into 4 data.tables and save into 4 rds files (to mimic my file structure)
nlow<-1
for (i in 1:4) {
  filei<-paste0('mt',i,'.rds')
  nhigh<-i*8
  mti<-mt[nlow:nhigh]
  saveRDS(mti,file=filei)
  nlow<-i*8+1
}
# read the files in parallel and aggregate
mt6<-foreach(j=1:4,.combine='rbind') %dopar% { # works with %do%
  filej<-paste0('mt',j,'.rds')
  mtj<-readRDS(filej)
  mtj[cyl == 6]
}
stopCluster(cl)
I get the following error message when using %dopar% :
Error in { : task 1 failed - "object 'cyl' not found"
When I change the %dopar% command to %do%, I do not get an error
message. What is the problem in the use of %dopar%?
I would appreciate help in troubleshooting.
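One thing that may be worth checking, sketched here under the assumption that the worker processes have not loaded data.table: in that case `mtj[cyl == 6]` on a worker would fall back to data.frame subsetting, where `cyl` is looked up as an ordinary variable and not as a column. The `.packages` argument of foreach asks each worker to load the listed packages. The `tmp` directory and the absolute file paths are illustrative choices, not part of the example above.

```r
library(foreach)
library(doSNOW)
library(data.table)

cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)

# Illustrative scratch directory; absolute paths avoid relying on the
# working directory of the worker processes.
tmp <- tempdir()

# Split mtcars into 4 chunks of 8 rows and save each as an rds file.
mt <- data.table(mtcars)
for (i in 1:4) {
  rows <- ((i - 1) * 8 + 1):(i * 8)
  saveRDS(mt[rows], file = file.path(tmp, paste0("mt", i, ".rds")))
}

# .packages = "data.table" loads data.table on every worker, so that
# mtj[cyl == 6] is evaluated with data.table semantics there as well.
mt6 <- foreach(j = 1:4, .combine = "rbind",
               .packages = "data.table") %dopar% {
  mtj <- readRDS(file.path(tmp, paste0("mt", j, ".rds")))
  mtj[cyl == 6]
}

stopCluster(cl)
```

If this runs without the error, missing packages on the workers would be the likely explanation for the difference between %do% and %dopar%, since %do% evaluates everything in the master session where data.table is already attached.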
Instead of the foreach loop, I tried the same with a for loop. After
saving the aggregated result, I had to delete the table from the
currently read file, do garbage collection and then read in a new
file. Something like the following :
dtAll<-mt[0]
for (j in 1:25) {
  filetxt<-paste0('mt',j,'.rds')
  dtj<-readRDS(filetxt)               # read one file
  dtAll<-rbindlist(list(dtAll,dtj))   # append it to the running result
  rm(dtj);gc()                        # drop the piece and collect garbage
}
How is garbage collection handled in parallel computing? With the .combine
= 'rbind' option, this may not be necessary. Could somebody comment on
this? Would it be better to use the 'rbindlist' option instead of 'rbind'?
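To make the 'rbindlist' question concrete, here is a minimal sequential sketch of what I have in mind, assuming the filtered pieces fit in memory together: collect the pieces in a list and bind them once with `rbindlist`, so the growing table is not copied on every iteration. The `tmp` directory and the `files` and `pieces` names are only illustrative, and I have not measured whether this is faster on my real data.

```r
library(data.table)

# Illustrative scratch directory and file names.
tmp <- tempdir()
files <- file.path(tmp, paste0("mt", 1:4, ".rds"))

# Recreate the 4 example files (8 rows of mtcars each).
mt <- data.table(mtcars)
for (i in 1:4) {
  rows <- ((i - 1) * 8 + 1):(i * 8)
  saveRDS(mt[rows], file = files[i])
}

# Read each file, keep only the rows of interest, and collect the pieces.
# Each full table goes out of scope when the function returns, so only
# the filtered piece is retained.
pieces <- lapply(files, function(f) readRDS(f)[cyl == 6])

# Bind all pieces in a single call.
dtAll <- rbindlist(pieces)
```

With foreach, a similar effect might be obtained by passing `.combine = function(...) rbindlist(list(...))` together with `.multicombine = TRUE`; I have not tried that variant. As far as I understand, each worker is a separate R process with its own memory, so garbage collection on a worker should be independent of the master session.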
First, I would like to know what my problem with %dopar% is.
Thanks for any help that I can get.
Vivek