[Bioc-devel] reproducible with mclapply?

Yu, Guangchuang gcyu at connect.hku.hk
Thu Jun 4 04:07:30 CEST 2015


One possible solution is posted at
http://stackoverflow.com/questions/30610375/how-to-run-permutations-using-mclapply-in-a-reproducible-way-regardless-of-numbe/30627984#30627984

As Kasper suggested, using set.seed inside a package is not good practice.
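
Kasper's other suggestion, quoted below, was to generate the permutation
indices outside of mclapply and then run mclapply over the list of indices.
A minimal sketch of that idea follows. The toy data and the names perm_idx,
x and cores are assumptions for illustration, not part of any package, and
set.seed is called at the user level of the script rather than inside a
function.

```r
library(parallel)

n_perm <- 10
x <- seq(10, 100, by = 10)   # toy data, illustrative only

# Draw every permutation serially in the parent process, under the
# user's own seed.
set.seed(0)
perm_idx <- lapply(seq_len(n_perm), function(i) sample(length(x)))

# mc.cores > 1 is not supported on Windows, so fall back to 1 there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L

# The workers only consume precomputed indices; no random numbers are
# drawn inside mclapply, so the number of cores cannot change the result.
res_a <- mclapply(perm_idx, function(idx) x[idx], mc.cores = cores)
res_b <- mclapply(perm_idx, function(idx) x[idx], mc.cores = 1L)

identical(res_a, res_b)
```

Because all randomness is used up before the parallel step, the same seed
should give the same permutations whatever mc.cores is set to.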

I suggest adding a parameter, for example seed=FALSE, that leaves set.seed
disabled by default. If the user wants reproducible results, e.g. in a
demonstration, they can set seed=TRUE explicitly, and set.seed will then be
run inside the function.

Best,
Guangchuang

On Wed, Jun 3, 2015 at 8:42 PM, Kasper Daniel Hansen <
kasperdanielhansen at gmail.com> wrote:

> For this situation, generate the permutation indexes outside of the
> mclapply, and then do mclapply over a list with the indices.
>
> And btw., please don't use set.seed inside a package; that control should
> completely be left to the user.
>
> Best,
> Kasper
>
> On Wed, Jun 3, 2015 at 7:08 AM, Vincent Carey <stvjc at channing.harvard.edu>
> wrote:
>
>> This document indicates how to achieve reproducibility independent of the
>> underlying physical environment.
>>
>> http://cran.r-project.org/web/packages/doRNG/vignettes/doRNG.pdf
>>
>> Let me know if that satisfies the question.
>>
>> On Wed, Jun 3, 2015 at 5:32 AM, Yu, Guangchuang <gcyu at connect.hku.hk>
>> wrote:
>>
>> > Dear Vincent,
>> >
>> > RNGkind("L'Ecuyer-CMRG") behaves the same as using mc.set.seed=FALSE.
>> >
>> > When mc.cores changes, the output is still not reproducible.
>> >
>> > I think this issue is also of concern within the Bioconductor community,
>> > as parallel versions of permutation tests are now commonly used.
>> >
>> > Best Regards,
>> >
>> > Guangchuang
>> >
>> >
>> >
>> > On Wed, Jun 3, 2015 at 5:17 PM, Vincent Carey <stvjc at channing.harvard.edu>
>> > wrote:
>> >
>> >> Hi, this question belongs on R-help, but perhaps
>> >>
>> >>
>> >> https://stat.ethz.ch/R-manual/R-devel/library/parallel/html/RngStream.html
>> >>
>> >> will be useful.
>> >>
>> >> Best regards
>> >>
>> >> On Wed, Jun 3, 2015 at 3:11 AM, Yu, Guangchuang <gcyu at connect.hku.hk>
>> >> wrote:
>> >>
>> >>> Dear all,
>> >>>
>> >>> I have an issue with setting the seed value when using the parallel package.
>> >>>
>> >>> > library("parallel")
>> >>> > library("digest")
>> >>> >
>> >>> > set.seed(0)
>> >>> > m <- mclapply(1:10, function(x) sample(1:10),
>> >>> +               mc.cores=2)
>> >>> > digest(m, 'crc32')
>> >>> [1] "4827c80c"
>> >>> >
>> >>> > set.seed(0)
>> >>> > m <- mclapply(1:10, function(x) sample(1:10),
>> >>> +               mc.cores=2)
>> >>> > digest(m, 'crc32')
>> >>> [1] "e95b9134"
>> >>>
>> >>> By default, set.seed() is ignored, since mclapply sets the seed
>> >>> internally.
>> >>>
>> >>> If we use mc.set.seed=FALSE to disable this feature, it works as
>> >>> indicated below:
>> >>>
>> >>> > set.seed(0)
>> >>> > m <- mclapply(1:10, function(x) sample(1:10),
>> >>> +               mc.cores=2, mc.set.seed = FALSE)
>> >>> > digest(m, 'crc32')
>> >>> [1] "6bbada78"
>> >>> >
>> >>> > set.seed(0)
>> >>> > m <- mclapply(1:10, function(x) sample(1:10),
>> >>> +               mc.cores=2, mc.set.seed = FALSE)
>> >>> > digest(m, 'crc32')
>> >>> [1] "6bbada78"
>> >>>
>> >>> The problem is that the results also depend on the number of
>> >>> cores.
>> >>>
>> >>> > set.seed(0)
>> >>> > m <- mclapply(1:10, function(x) sample(1:10),
>> >>> +               mc.cores=4, mc.set.seed = FALSE)
>> >>> > digest(m, 'crc32')
>> >>> [1] "a22e0aab"
>> >>>
>> >>>
>> >>> Any idea?
>> >>>
>> >>> Best Regards,
>> >>> Guangchuang
>> >>> --
>> >>> --~--~---------~--~----~------------~-------~--~----~
>> >>> Guangchuang Yu, PhD Candidate
>> >>> State Key Laboratory of Emerging Infectious Diseases
>> >>> School of Public Health
>> >>> The University of Hong Kong
>> >>> Hong Kong SAR, China
>> >>> www: http://ygc.name
>> >>> -~----------~----~----~----~------~----~------~--~---
>> >>>
>> >>> _______________________________________________
>> >>> Bioc-devel at r-project.org mailing list
>> >>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>> >>>
>> >>
>> >>
>> >
>> >
>> >
>>
>>
>
>




