[BioC] Statistical comparison of low replicate affy data

Mon Mar 1 02:13:49 MET 2004

If you really have good replicability, you should be able to use a 
gene-by-gene 2-sample t-test.
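
A minimal sketch of what that could look like in base R. The data here are
simulated, and the names expr, gene_p and p_adj are made up for illustration;
with real data the matrix would presumably come from exprs() on the RMA
object. The Benjamini-Hochberg adjustment at the end is my assumption about
how one might handle the multiple testing, not part of the suggestion above.

```r
# Hypothetical stand-in for the RMA expression matrix: 1000 genes x 6 arrays
# (log2 scale), columns 1-3 untreated and columns 4-6 treated.
set.seed(1)
expr <- matrix(rnorm(1000 * 6, mean = 8, sd = 0.2), nrow = 1000)
expr[1:50, 4:6] <- expr[1:50, 4:6] + 1   # simulate 50 genes that really change
cl <- c(0, 0, 0, 1, 1, 1)

# One equal-variance two-sample t-test per gene (row); keep only the p-value.
gene_p <- apply(expr, 1, function(x) {
  t.test(x[cl == 1], x[cl == 0], var.equal = TRUE)$p.value
})

# Assumption: Benjamini-Hochberg adjustment to control the FDR across genes.
p_adj <- p.adjust(gene_p, method = "BH")

# Number of genes called at an adjusted p-value below 0.05.
sum(p_adj < 0.05)
```

With only 3 arrays per group each test has 4 degrees of freedom, so this is
only sensible when the replicates really are as tight as described.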

--Naomi

At 03:26 AM 2/19/2004, Matthew Hannah wrote:
>Thanks for the responses.
>
>I used SAM by calling it on the exprsSet, as was suggested. (It also worked
>on a table made from a text file output from rma.) Anyway, when I did the
>analysis I got some 'interesting' results (see bottom of mail). Basically,
>unpaired, the data gave me 2500 genes at delta = 2 with an FDR of 0.003, and
>paired (the U/T reps were conducted on the same plant batch) 4000 genes at
>delta = 2 with an FDR of 0.001. However, if I permute the input data (see
>ex. 3 below), it just returns zeros. I guess this could be due to the
>coarseness of the comparisons, as suggested, but I'll give a few more details
>of my data to see what people think.
>
>Basically the data is highly reproducible between biological replicates, but
>there is a big 'treatment' effect (this is what we want!?). For example, Rsq
>for within-replicate x-y scatter plots of GCRMA data is 0.97 - 0.99, whilst
>for the 3 U-T comparisons the values are 0.92 - 0.93.
>
>So, as I interpret things, as soon as you permute the data you get very
>different data sets being mixed, massively increasing the variance, so few
>significant changes are detected, hence a very low FDR. If you input the data
>already permuted, then some of the permutations of this data have loads of
>sig changes (as they represent the correct data order), so the FDR is huge and
>SAM returns all 0's.
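
To make the coarseness point concrete, here is a rough sketch in base R of
how few relabellings a 3 vs 3 design allows. It uses invented values for a
single gene and a plain difference in means; it is not SAM's internal
procedure, and the names x, treated_sets and p_perm are hypothetical.

```r
# Hypothetical log2 expression for one gene: 3 untreated, then 3 treated arrays.
x  <- c(7.9, 8.0, 8.1, 9.0, 9.1, 8.9)
cl <- c(0, 0, 0, 1, 1, 1)

# Every way of choosing which 3 of the 6 arrays are labelled "treated".
treated_sets <- combn(6, 3)        # 3 x 20 matrix, one relabelling per column
n_perm <- ncol(treated_sets)       # choose(6, 3) = 20

# Difference in group means under each relabelling.
perm_diff <- apply(treated_sets, 2, function(idx) mean(x[idx]) - mean(x[-idx]))
obs_diff  <- mean(x[cl == 1]) - mean(x[cl == 0])

# Two-sided permutation p-value: fraction of relabellings at least as extreme
# as the observed one (small tolerance guards against rounding error).
p_perm <- mean(abs(perm_diff) >= abs(obs_diff) - 1e-12)
p_perm
```

Even for a gene whose groups separate perfectly, only the true labelling and
its mirror image are this extreme, so a per-gene two-sided permutation
p-value cannot go below 2/20 = 0.1 with 3 arrays per group.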
>
>So where does this leave us? Not using a test because the data is too 'good'
>seems a bit strange. But equally, not knowing how reliable it is is also not
>good.
>
>Also, on a more general note, when you get to this stage with so many changes
>(1 rep U-T comparison with GCRMA: 5000 1.5x and 2500 2x changes), is the data
>violating the normalisation assumption that most genes remain
>unchanged?
>
>I'll investigate the limma package.
>
>Thanks
>Matt
>
> > cl = c(0,0,0,1,1,1)
> > rmasam <- sam(rma, cl)
>SAM Analysis for the two class unpaired case.
>
>s0 = 0.0695  (The 15 % quantile of the s values.)
>
>SAM Analysis for a set of delta:
>    delta    p0 false called   FDR
>1    0.2 0.723  9638  13270 0.525
>2    0.4 0.723  3951   9543 0.299
>3    0.6 0.723  1634   7480 0.158
>4    0.8 0.723   643   6068 0.077
>5    1.0 0.723   286   5155 0.040
>6    1.2 0.723   131   4394 0.022
>7    1.4 0.723    64   3764 0.012
>8    1.6 0.723    35   3259 0.008
>9    1.8 0.723    18   2846 0.005
>10   2.0 0.723    10   2478 0.003
>
> > cl = c(1,2,3,-1,-2,-3)
> > rmasamp <- sam(rma, cl)
>SAM Analysis for the two class paired case.
>
>s0 = 0.0733  (The 45 % quantile of the s values.)
>
>SAM Analysis for a set of delta:
>    delta   p0 false called   FDR
>1    0.2 0.52  9631  17275 0.290
>2    0.4 0.52  2378  13276 0.093
>3    0.6 0.52   695  10684 0.034
>4    0.8 0.52   257   8922 0.015
>5    1.0 0.52   127   7664 0.009
>6    1.2 0.52    53   6575 0.004
>7    1.4 0.52    28   5652 0.003
>8    1.6 0.52    14   4985 0.001
>9    1.8 0.52     9   4370 0.001
>10   2.0 0.52     5   3867 0.001
>
> > cl = c(0,1,0,1,0,1)
> > rmasamperm <- sam(rma, cl)
>SAM Analysis for the two class unpaired case.
>
>s0 = 0.0549  (The 5 % quantile of the s values.)
>
>SAM Analysis for a set of delta:
>    delta p0 false called FDR
>1    0.2  1     0      0   0
>2    0.4  1     0      0   0
>3    0.6  1     0      0   0
>4    0.8  1     0      0   0
>5    1.0  1     0      0   0
>6    1.2  1     0      0   0
>7    1.4  1     0      0   0
>8    1.6  1     0      0   0
>9    1.8  1     0      0   0
>10   2.0  1     0      0   0

Naomi S. Altman                                814-865-3791 (voice)
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics                              814-863-7114 (fax)
Penn State University                         814-865-1348 (Statistics)
University Park, PA 16802-2111