[BioC] siggenes permutation count problem
paul.boutros@utoronto.ca
paul.boutros at utoronto.ca
Sat Jan 7 19:48:35 CET 2006
Hi Jim (and others who replied off-list),
Thank you -- when I saw the term "complete permutations", it didn't register in
my head that it really meant combinations.
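For anyone finding this thread later, the counts can be checked directly in R. This is only a rough sketch: it uses base R (factorial, choose, combn, t.test) on made-up random values rather than the real expression data, the variable names are invented for the example, and it is not meant to reproduce how siggenes computes its statistics internally.

```r
# Orderings of 6 arrays vs. distinct ways to split them into two groups of 3
n_perm <- factorial(6)    # 720 orderings
n_comb <- choose(6, 3)    # 20 distinct 3-vs-3 splits

# Each split is hit by 3! * 3! orderings, since order within a group is irrelevant
n_perm / (factorial(3) * factorial(3))

# Enumerate every choice of which 3 arrays get class label 1
splits <- combn(6, 3)     # 3 x 20 matrix, one column per split
labels <- apply(splits, 2, function(idx) {
  cl <- integer(6)
  cl[idx] <- 1L
  cl
})                        # 6 x 20 matrix of 0/1 class labels

# Toy values standing in for one gene's expression (assumption, not real data)
set.seed(1)
x <- rnorm(6)

# One Welch t-statistic per relabelling
tstat <- apply(labels, 2, function(cl) {
  unname(t.test(x[cl == 1], x[cl == 0], var.equal = FALSE)$statistic)
})
length(tstat)
```

Running all 720 orderings would only repeat each of these 20 relabellings 36 times.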
Paul
Quoting "James W. MacDonald" <jmacdon at med.umich.edu>:
> paul.boutros at utoronto.ca wrote:
> > Hello,
> >
> > I'm having some trouble interpreting how/why siggenes performed a certain
> > number of permutations on my dataset. This is an affy dataset that was
> > normalized by:
> >
> > data <- ReadAffy(filenames=cel.files, phenoData="phenodata.txt");
> > eset <- expresso(data, normalize.method="constant",
> >                  bgcorrect.method="none", pmcorrect.method="mas",
> >                  summary.method="avgdiff");
> >
> > I realize that the normalization is a bit unusual: this study is actually
> > testing a range of normalization methods. This is a two-class experiment
> > with 3 arrays in each group:
> >
> >
> >>eset;
> >
> > Expression Set (exprSet) with
> > 22690 genes
> > 6 samples
> > phenoData object with 1 variables and 6 cases
> > varLabels
> > Group: read from file
> >
> >>design;
> >
> > [1] 1 1 0 1 0 0
> >
> >
> > So to do a SAM-like analysis I used:
> > SAM.data <- sam(data=eset, cl=design, var.equal=FALSE, B=1000);
> >
> > And I expected there to be 6! = 720 total possible permutations. So I was
> > surprised to get this output:
> >
> >>SAM.data <- sam(data=eset, cl=design, var.equal=FALSE, B=1000);
> >
> >
> > We're doing 20 complete permutations
> >
> >
> > Why does siggenes think there are only 20 complete permutations to be used?
> > Have I done something wrong, or is my understanding of how the permutations
> > are done in error?
>
> It's a combination of incorrect terminology and (possibly) a
> misunderstanding on your part. First, there *are* 720 possible
> permutations, but we don't care about the ordering within each group
> since we are simply comparing group means. What we really want here are
> combinations, and there are only 20 combinations when you have 6 samples
> and you are choosing three for each group (see ?choose). If you did all
> 720 permutations it would result in only 20 unique t-statistics with a
> lot of replication.
>
> This terminology is a holdover from SAM, which AFAIK really did do the
> permutations rather than combinations. However, this is very wasteful of
> computing time, especially when the number of replicates gets large, so
> siggenes rightly does the combinations and abuses terminology by calling
> them 'complete permutations'.
>
> Best,
>
> Jim
>
>
> >
> > This is R 2.2.1 and siggenes 1.4.0 on WinXP.
> >
> > Paul
> >
> --
> James W. MacDonald
> University of Michigan
> Affymetrix and cDNA Microarray Core
> 1500 E Medical Center Drive
> Ann Arbor MI 48109
> 734-647-5623