[BioC] set.seed() in gcrma

Henrik Bengtsson hb at stat.berkeley.edu
Wed Feb 13 20:18:01 CET 2008


On Feb 13, 2008 10:46 AM, Kasper Daniel Hansen
<khansen at stat.berkeley.edu> wrote:
> In my opinion gcrma should _not_ play with the seed. That is
> something for the user to decide.

It's a trade-off between providing reproducible examples and adding to
the overall confusion.  I think it is important to be able to generate
identical results for identical calls, and since it is not clear to
most users that the gcRMA parameters are estimated based on a sample
one I can see how the current solution avoids the problem of having
ask people to set the seed explicitly before calling gcrma() [also I
have a hard time to keep track on what is going on in all methods].

One solution would be to move the value of seed to be an argument with
a default value and if NA/NULL the seed is *not* set.  That would
provide a backward-compatible upgrade to gcrma().

Comments?

Henrik

>
> Kasper
>
>
> On Feb 13, 2008, at 9:49 AM, Zhijin Wu wrote:
>
> > Bill
> > Thank you for that suggestion. We will edit gcrma to restore the
> > seed in
> > the next release.
> > Zhijin
> > Henrik Bengtsson wrote:
> >> Hi,
> >>
> >> I think you have spotted something very important, and I agree that
> >> the random seed should be reset by any function that set it in order
> >> to get reproducible samples.  We've been working with a port/wrapper
> >> for gcrma() but this never struck us - thanks for bringing it up.
> >>
> >> FYI, I've just sent a question to R-devel, as it is more appropriate
> >> there, on how to best push the random generator back to the state it
> >> was before calling set.seed().
> >>
> >> Best,
> >>
> >> Henrik
> >>
> >> On Feb 13, 2008 7:53 AM, William T Barry <bill.barry at duke.edu> wrote:
> >>> Hello,
> >>>
> >>> While recently using the gcrma package, I noticed some strange
> >>> behavior
> >>> that I want to report.  I was applying the pre-processing
> >>> algorithm (using
> >>> default settings) inside a resampling scheme and noticed that
> >>> subsequent
> >>> calls of sample() were returning a constant vector of values.
> >>> Example
> >>> output shown below:
> >>>
> >>>> temp <- gcrma(celdat[,1:2])
> >>> Adjusting for optical effect..Done.
> >>> Computing affinities.Done.
> >>> Adjusting for non-specific binding..Done.
> >>> Normalizing
> >>> Calculating Expression
> >>>
> >>>> sample(1:10000,3)
> >>> [1] 7030 5473 8109
> >>>
> >>>> temp <- gcrma(celdat[,1:2])
> >>> Adjusting for optical effect..Done.
> >>> Computing affinities.Done.
> >>> Adjusting for non-specific binding..Done.
> >>> Normalizing
> >>> Calculating Expression
> >>>
> >>>> sample(1:10000,3)
> >>> [1] 7030 5473 8109
> >>>
> >>>> sessionInfo()
> >>> R version 2.6.1 (2007-11-26)
> >>> x86_64-redhat-linux-gnu
> >>>
> >>> locale:
> >>> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_
> >>> US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en
> >>> _US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US
> >>> .UTF-8;LC_IDENTIFICATION=C
> >>>
> >>> attached base packages:
> >>> [1] splines   tools     stats     graphics  grDevices utils
> >>> datasets
> >>> [8] methods   base
> >>>
> >>> other attached packages:
> >>> [1] hu6800probe_2.0.0    hu6800cdf_2.0.0      gcrma_2.10.0
> >>> [4] matchprobes_1.10.0   affy_1.16.0          preprocessCore_1.0.0
> >>> [7] affyio_1.6.1         Biobase_1.16.2
> >>>
> >>> loaded via a namespace (and not attached):
> >>> [1] rcompgen_0.1-17
> >>>
> >>> In looking at the gcrma functions, it appears that gcrma.engine2
> >>> () is
> >>> setting the seed without saving and restoring .Random.seed .
> >>> This was
> >>> easily corrected in the external loop of the resampling scheme,
> >>> but I was
> >>> unsure whether this would be considered a 'bug' in gcrma.
> >>>
> >>> Kind regards,
> >>> Bill
> >>>
> >>>
> >>> -------------------------------
> >>> William T. Barry, Ph.D.
> >>> Dept. of Biostatistics & Bioinformatics, DUMC
> >>> Duke Institute for Genome Science & Policy
> >>> Hock Plaza, Room 8030
> >>> DUMC Box 2717
> >>> Durham, NC 27710
> >>>
> >>> Phone: 919-681-5047
> >>> Fax: 919-668-9335
> >>>
> >>>
> >>>         [[alternative HTML version deleted]]
> >>>
> >>> _______________________________________________
> >>> Bioconductor mailing list
> >>> Bioconductor at stat.math.ethz.ch
> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>> Search the archives: http://news.gmane.org/
> >>> gmane.science.biology.informatics.conductor
> >>>
> >>
> >> _______________________________________________
> >> Bioconductor mailing list
> >> Bioconductor at stat.math.ethz.ch
> >> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >> Search the archives: http://news.gmane.org/
> >> gmane.science.biology.informatics.conductor
> >
> >
> > --
> > -------------------------------------------
> > Zhijin (Jean) Wu
> > Assistant Professor of Biostatistics
> > Brown University, Box G-S121
> > Providence, RI  02912
> >
> > Tel: 401 863 1230
> > Fax: 401 863 9182
> > http://stat.brown.edu/~zwu
> >
> > _______________________________________________
> > Bioconductor mailing list
> > Bioconductor at stat.math.ethz.ch
> > https://stat.ethz.ch/mailman/listinfo/bioconductor
> > Search the archives: http://news.gmane.org/
> > gmane.science.biology.informatics.conductor
>
>



More information about the Bioconductor mailing list