[R] Problem with mclapply -- losing output/data

Elizabeth Purdom epurdom at stat.berkeley.edu
Tue Mar 22 09:13:40 CET 2011

I am running large simulations, which unfortunately I can't reproduce 
here because the code is too extensive. I rely heavily on mclapply, but 
I have realized that I am losing data somewhere.

There are two worrisome symptoms:
1) I am getting 'NULL' as a return value for some (but not all) elements 
of the output when I use mclapply, but not when I use lapply:
 > tmp2[1:3] #output from lapply
10000076 10000077
       24       24

10000076 10000077
      119      119


 > tmp[1:3] #output from mclapply



2) I am not getting back a list of the same length as the input vector 
I am parallelizing over, i.e., a command like this:

tmp<-mclapply(x, FUN=myfunc, mc.cores=16)

gives me back a list tmp that is not the same length as x (and so I am 
getting all kinds of errors).
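
Both symptoms could be flagged right after the call with a check along 
these lines. This is only a sketch: check_mc_result is a hypothetical 
helper (not part of multicore), and it assumes that myfunc never 
legitimately returns NULL.

```r
# Hypothetical helper: compares an mclapply result against its input.
# Assumption: the worker function never legitimately returns NULL.
check_mc_result <- function(x, res) {
  is_null  <- vapply(res, is.null, logical(1))
  is_error <- vapply(res, inherits, logical(1), what = "try-error")
  list(length_ok = length(res) == length(x),  # symptom 2
       null_idx  = which(is_null),            # symptom 1
       error_idx = which(is_error))           # errors captured via try()
}

# Self-contained demonstration on a hand-built result list
x   <- 1:4
res <- list(1, NULL, try(stop("boom"), silent = TRUE), 4)
str(check_mc_result(x, res))
```

In the real simulation this would be called as check_mc_result(x, tmp) 
immediately after the mclapply call, before tmp is used any further.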

This is extremely discouraging, because I have been using mclapply 
extensively at many points in simulations that take a very long time 
to run, and now I am wondering whether what I am getting is trustworthy. 
I don't think I could reasonably finish my results without mclapply, but 
I am thinking of cutting it out except where it is absolutely necessary, 
time-wise. If anyone has any suggestions as to why this might be 
happening and how I can circumvent it (or test for it happening), I 
would greatly appreciate it.
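
Whatever the cause turns out to be, one possible stopgap is to recompute 
the affected elements serially. The following is a minimal sketch under 
stated assumptions: safe_mclapply is a hypothetical wrapper name, the 
function never legitimately returns NULL, and mclapply is taken from the 
parallel package shipped with later R versions (with multicore_0.1-4 one 
would load multicore instead).

```r
library(parallel)  # assumption: later R; the session above used multicore

# Hypothetical wrapper: run mclapply, then recompute with lapply every
# element that came back NULL or as a "try-error" object. If the result
# has the wrong length, everything is recomputed serially.
safe_mclapply <- function(x, FUN, ..., mc.cores = 2L) {
  res <- mclapply(x, FUN, ..., mc.cores = mc.cores)
  if (length(res) != length(x)) {
    warning("length mismatch; recomputing everything serially")
    return(lapply(x, FUN, ...))
  }
  bad <- which(vapply(res,
                      function(r) is.null(r) || inherits(r, "try-error"),
                      logical(1)))
  if (length(bad) > 0) {
    warning(length(bad), " element(s) failed in parallel; rerunning serially")
    res[bad] <- lapply(x[bad], FUN, ...)
  }
  res
}

# Toy usage; forking with mc.cores > 1 is not available on Windows
out <- safe_mclapply(1:8, function(i) i^2, mc.cores = 2L)
```

A genuine error in the function is raised by the serial rerun instead of 
being stored silently, so real bugs surface while lost results are 
simply recomputed.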

Elizabeth Purdom

 > sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)

  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               
  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8    
  [9] LC_ADDRESS=C               LC_TELEPHONE=C             

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] multicore_0.1-4       msm_1.0               gtools_2.6.2          
graph_1.28.0          Rsamtools_1.2.3
[6] Biostrings_2.18.2     GenomicFeatures_1.2.3 GenomicRanges_1.2.3   

loaded via a namespace (and not attached):
  [1] Biobase_2.10.0     biomaRt_2.6.0      BSgenome_1.18.3    
DBI_0.2-5          mvtnorm_0.9-96     RCurl_1.5-0
  [7] RSQLite_0.9-4      rtracklayer_1.10.6 splines_2.12.1     
survival_2.36-2    tools_2.12.1       XML_3.2-0
