[BioC] SAM siggenes number of permutations
Holger Schwender
holger.schw at gmx.de
Thu Jan 31 13:30:02 CET 2008
Hi Claus,
yes, you are correct (and not only because Wikipedia says so). I didn't pay attention to the fact that, when all possible permutations are enumerated, the vector of the actual class labels is itself one of them.
Sorry about that.
Holger
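
As an illustration of the point conceded above, here is a minimal sketch of an exact one-sided permutation test for two groups of three samples. It is not siggenes code; the function name `exact_permutation_pvalue`, the difference-in-means statistic, and the data are assumptions made up for the example. It enumerates all 20 label assignments and shows that counting with ">=" gives a smallest possible p-value of 1/20, whereas counting with ">" can give 0.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_pvalue(values, group_a_idx, strict=False):
    """One-sided exact permutation p-value for a difference in group means.

    values      -- all measurements (both groups pooled)
    group_a_idx -- indices of the samples observed in group A
    strict      -- if True, count with ">" instead of ">="
    Returns (p_value, number_of_label_assignments).
    """
    n = len(values)
    k = len(group_a_idx)

    def mean_diff(idx_a):
        in_a = set(idx_a)
        a = [values[i] for i in range(n) if i in in_a]
        b = [values[i] for i in range(n) if i not in in_a]
        return mean(a) - mean(b)

    observed = mean_diff(group_a_idx)
    # Every way of choosing which k samples carry label A; the observed
    # labelling is necessarily one of these assignments.
    stats = [mean_diff(c) for c in combinations(range(n), k)]
    if strict:
        n_extreme = sum(s > observed for s in stats)
    else:
        n_extreme = sum(s >= observed for s in stats)
    return n_extreme / len(stats), len(stats)


# Made-up data: group A (first three values) is clearly larger than group B.
values = [5.1, 4.8, 5.3, 2.0, 2.2, 1.9]
p_ge, n_perm = exact_permutation_pvalue(values, (0, 1, 2))
p_gt, _ = exact_permutation_pvalue(values, (0, 1, 2), strict=True)
print(n_perm, p_ge, p_gt)
```

With the most extreme possible split as the observed data, the ">=" version returns 1/20 because the observed labelling counts itself, while the ">" version returns 0.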
-------- Original Message --------
> Date: Thu, 31 Jan 2008 11:52:41 +0000
> From: Claus-Dieter Mayer <claus at bioss.ac.uk>
> To: Holger Schwender <holger.schw at gmx.de>
> CC: olivier.armant at itg.fzk.de, bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] SAM siggenes number of permutations
> Hi Holger,
>
> I am not a SAM expert, so I accept the second point you mention. My
> comment referred rather to a standard permutation test. I am not sure
> that I agree with your first point, though. In my understanding a
> p-value is the probability of "obtaining a result at least as extreme
> as a given data point" (Wikipedia agrees with me on this), i.e. it is the
> probability of being ">=" the observed value. So if there are only 20
> possible ways to split the data into the two groups, one of them will
> lead to the observed value, and the p-value will be at least 1/20.
> Replacing the ">=" by a ">" in the calculation of the p-value will give
> the wrong result (at least if the number of permutations is small).
> In general, exact zeros should not occur for p-values in real-life
> situations (mathematically you can of course construct situations where
> certain values cannot be obtained under the null hypothesis).
> The zeros you occasionally find in output are just extremely small
> numbers whose first non-zero digit lies beyond the decimal places that
> are displayed.
>
> Best Wishes
>
> Claus
>
> Holger Schwender wrote:
> > Hi Claus,
> >
> > this is not entirely correct. If none of the permuted test scores is
> > larger than the actual test score, then your p-value will be 0.
> >
> > Moreover, SAM does not use just the B permuted test scores of a particular
> > gene to compute its p-value, but all mB permuted test scores of all m genes,
> > so that the p-value of a gene is given by i/(mB) instead of i/B, where i
> > is the number of more extreme permuted test scores and B is the number of
> > permutations.
> >
> > Best,
> > Holger
> >
> >
> > -------- Original Message --------
> >
> >> Date: Wed, 30 Jan 2008 15:27:25 +0000
> >> From: Claus-Dieter Mayer <claus at bioss.ac.uk>
> >> To: olivier armant <olivier.armant at itg.fzk.de>
> >> CC: bioconductor at stat.math.ethz.ch
> >> Subject: Re: [BioC] SAM siggenes number of permutations
> >>
> >
> >
> >> Dear Olivier,
> >>
> >> my guess is that you have 2 groups with 3 samples each, in which case
> >> there are only 20 different possible permutations (6 choose 3) and the
> >> software is clever enough to realise that. In that case the calculation
> >> is exact, but you will not find anything significant, as the smallest
> >> possible p-value is 5% (1/20) for a one-sided and 10% (2/20) for a
> >> two-sided test. The question of how large the group sizes must be in
> >> order to apply permutation tests was discussed on this list some time
> >> ago; have a look at
> >> https://stat.ethz.ch/pipermail/bioconductor/2007-November/020110.html.
> >>
> >> Hope that helps,
> >>
> >> Claus
> >>
> >> olivier armant wrote:
> >>
> >>> Dear all,
> >>>
> >>> I am trying to run SAM on my data using siggenes in R 2.4.1 (I am a beginner).
> >>>
> >>> The call I use is (after creating the vector)
> >>> sam.out <- sam(data.gcrma, sam.c1, B=100, var.equal=TRUE, med=TRUE)
> >>>
> >>> It seems to work well, but I always get the message:
> >>> number of effective permutations=20
> >>>
> >>> Does it mean that only 20 permutations were done, whereas I asked for
> >>> 100 permutations with the argument B=100?
> >>>
> >>> I read in the SAM Excel package from Stanford that a precise FDR
> >>> requires 1000 permutations. What do you think?
> >>
> >>> Help would be welcome
> >>>
> >>>
> >>> Olivier ARMANT PhD.
> >>>
> >>> Institute of Toxicology and Genetics
> >>> Forschungszentrum Karlsruhe
> >>> Hermann-von-Helmholtz-Platz 1
> >>> D-76344 Eggenstein-Leopoldshafen
> >>> Germany
> >>>
> >>> tel: +49-7247-82-2560
> >>> fax: +49-7247-82-3354
> >>>
> >>> _______________________________________________
> >>> Bioconductor mailing list
> >>> Bioconductor at stat.math.ethz.ch
> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>> Search the archives:
> >>> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >>
> >> --
> >>
> >> ***********************************************************************************
> >> Dr Claus-D. Mayer | http://www.bioss.ac.uk
> >> Biomathematics & Statistics Scotland | email: claus at bioss.ac.uk
> >> Rowett Research Institute | Telephone: +44 (0) 1224 716652
> >> Aberdeen AB21 9SB, Scotland, UK. | Fax: +44 (0) 1224 715349
> >>
> >
> >
>
>
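
The pooling described in the quoted message from Holger (a p-value of i/(mB) rather than i/B) can be sketched as follows. This is a rough illustration under stated assumptions, not the siggenes implementation: the function name `pooled_pvalues` is hypothetical, "more extreme" is taken to mean "at least as large in absolute value", and the scores are made up.

```python
from bisect import bisect_left


def pooled_pvalues(observed, permuted):
    """Pooled permutation p-values, one per gene.

    observed -- list of m observed test scores (one per gene)
    permuted -- list of m lists, each holding the B permuted scores of a gene
    Each p-value is i / (m * B), where i counts the pooled permuted scores
    whose absolute value is at least that of the gene's observed score.
    """
    m = len(observed)
    B = len(permuted[0])
    total = m * B
    # Pool the permuted scores of ALL genes into one sorted null distribution.
    # In this sketch the observed scores themselves are not added to the pool.
    pool = sorted(abs(s) for row in permuted for s in row)
    pvals = []
    for d in observed:
        # bisect_left gives the number of pooled values strictly below |d|,
        # so the remainder is the number of values >= |d|.
        i = total - bisect_left(pool, abs(d))
        pvals.append(i / total)
    return pvals


# Made-up example with m = 2 genes and B = 3 permutations.
observed = [3.0, 0.5]
permuted = [[0.1, -0.4, 1.2],
            [2.0, -0.6, 0.3]]
print(pooled_pvalues(observed, permuted))
```

In this sketch the p-values move in steps of 1/(mB) instead of 1/B, which is why pooling over many genes gives a much finer resolution than 20 permutations of a single gene would.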