[R] Problem with mclapply -- losing output/data
Patrick Connolly
p_connolly at slingshot.co.nz
Wed Mar 23 10:42:21 CET 2011
G'day Elizabeth,
For what it's worth, this is what I'd do were I in a position
like yours:
I would put a condition near the end of myfunc that responded
when there was an indication that NULLs were to be returned into
your main list. I'd make an additional list with those bits
which would also collect sufficient information to work out which
values of x lead to that result. Then you'll be able to see
which ones give the problem.
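A minimal sketch of that bookkeeping, with several assumptions: it uses mclapply from the parallel package that ships with current R (the session above used multicore), the body of myfunc is a made-up stand-in, and the wrapper name checked is hypothetical. The input is returned alongside the result because a list modified inside a forked worker never reaches the parent session.

```r
library(parallel)  # assumption: current R; the original session used 'multicore'

# Hypothetical stand-in for the real myfunc; returns NULL for some inputs.
myfunc <- function(x) if (x %% 3 == 0) NULL else x^2

# Hypothetical wrapper: hand back the input together with whatever myfunc gave.
checked <- function(x) {
  res <- tryCatch(myfunc(x), error = function(e) e)
  list(x = x, value = res)
}

x   <- 1:10
out <- mclapply(x, checked, mc.cores = 2)

# An element is suspect if the worker returned nothing at all, or if the
# wrapped value is NULL or an error object.
suspect <- vapply(out, function(r) {
  is.null(r) || is.null(r$value) || inherits(r$value, "error")
}, logical(1))

problems <- x[suspect]   # the values of x worth a closer look
```

Indexing x by suspect only makes sense when length(out) equals length(x), so that is worth checking first given the second symptom described below.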
Try running mclapply on only those bits and see if they all
respond the same way. If they do not, something very strange is
happening. But if those still behave the same way, then run with
only a single value of x in your call to mclapply.
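Something along these lines might do for the rerun. The stand-in myfunc and the vector problems are hypothetical placeholders for your own function and the inputs you flagged, and mclapply is taken from the parallel package rather than multicore.

```r
library(parallel)  # assumption: current R; the original session used 'multicore'

# Hypothetical stand-ins for the real function and the flagged inputs.
myfunc   <- function(x) if (x %% 3 == 0) NULL else x^2
problems <- c(3, 6, 9)

# Run only the suspect inputs, serially and in parallel, and compare.
serial  <- lapply(problems, myfunc)
par_res <- mclapply(problems, myfunc, mc.cores = 2)

same_length <- length(par_res) == length(problems)
agree       <- same_length && identical(serial, par_res)

# If the two still disagree, narrow down to a single input.
single <- mclapply(problems[1], myfunc, mc.cores = 2)
```

If agree comes out TRUE for the subset while the full run still loses results, that suggests the trouble depends on the size of the whole job rather than on particular values of x.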
I find the browser() function to be almost indispensable when working
out what's causing such problems, but to my knowledge, it won't work
when multiple cores are running in parallel. If you use a single
value of x, you can go back to using that trusted method. You might
also have to set mc.cores to 1, but I don't think so.
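As a rough illustration, assuming a made-up body for myfunc and that 3 is one of the troublesome inputs, the browser() call can be made conditional so it only stops where the NULL is about to be returned:

```r
# Hypothetical stand-in for the real myfunc, with a conditional browser() call.
myfunc <- function(x) {
  result <- if (x %% 3 == 0) NULL else x^2
  # Pause only in an interactive session, and only in the suspicious case.
  if (is.null(result) && interactive()) browser()
  result
}

# A single value of x through plain lapply, where browser() behaves as usual.
one <- lapply(3, myfunc)
```

The documentation for mclapply in the parallel package says that parallelization requires at least two cores, so calling mclapply(3, myfunc, mc.cores = 1) should keep the same call shape while staying in the main process.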
HTH
On Tue, 22-Mar-2011 at 01:13AM -0700, Elizabeth Purdom wrote:
> Hello,
> I am running large simulations, which unfortunately I can't really
> replicate here because the code is so extensive. I rely heavily on
> mclapply, but I realize that I'm losing data somewhere.
>
> There are two worrisome symptoms:
> 1) I am getting 'NULL' as a return value for some (but not all) elements
> of the output when I use mclapply, but not if I use lapply
> > tmp2[1:3] #output from lapply
> [[1]]
> 10000076 10000077
> 24 24
>
> [[2]]
> 10000076 10000077
> 119 119
>
> [[3]]
> 10000076
> 71
>
> > tmp[1:3] #output from mclapply
> [[1]]
> NULL
>
> [[2]]
> NULL
>
> [[3]]
> NULL
>
>
> 2) I am not getting back a list the same length as my input vector I'm
> parallelizing over. i.e. a command like this:
>
> tmp<-mclapply(x, FUN=myfunc, mc.cores=16)
>
> gives me back a list tmp which is not the same length as x (and so I'm
> getting all kinds of errors)
>
> This is extremely discouraging, because I've been using mclapply
> extensively at very many points on simulations that take a very long
> time to run, and now I'm wondering if what I'm getting is trustworthy. I
> don't think I could reasonably finish my results without mclapply, but I
> am thinking to cut it out except where it was absolutely necessary,
> time-wise. If anyone had any suggestions as to why this might be
> happening and how I can circumvent it (or test for it happening), I
> would greatly appreciate it.
>
> Thanks,
> Elizabeth Purdom
>
> > sessionInfo()
> R version 2.12.1 (2010-12-16)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
> LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
> LC_PAPER=en_US.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] multicore_0.1-4 msm_1.0 gtools_2.6.2
> graph_1.28.0 Rsamtools_1.2.3
> [6] Biostrings_2.18.2 GenomicFeatures_1.2.3 GenomicRanges_1.2.3
> IRanges_1.8.9
>
> loaded via a namespace (and not attached):
> [1] Biobase_2.10.0 biomaRt_2.6.0 BSgenome_1.18.3 DBI_0.2-5
> mvtnorm_0.9-96 RCurl_1.5-0
> [7] RSQLite_0.9-4 rtracklayer_1.10.6 splines_2.12.1
> survival_2.36-2 tools_2.12.1 XML_3.2-0
>
--
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
___ Patrick Connolly
{~._.~} Great minds discuss ideas
_( Y )_ Average minds discuss events
(:_~*~_:) Small minds discuss people
(_)-(_) ..... Eleanor Roosevelt
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.