[BioC] ANOVA, SAM and Limma

Thu Jun 24 22:35:15 CEST 2004

I just did a (very) small simulation study comparing one-way ANOVA with 
limma and SAM for various values of pi-0 (the proportion of genes that are 
not differentially expressed), with normal and t-distributed errors, 2 
replicates per treatment, and 22700 genes/array.  I did not replicate my 
simulations, so what I have to say here is necessarily heuristic, but there 
were some lessons.
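
For readers who want to try something similar, here is a minimal sketch of 
the gene-by-gene one-way ANOVA part of such a simulation. It is not the code 
used for the study above. The number of treatments, the effect-size 
distribution, and the names one_way_f and simulate_gene are all assumptions 
made for illustration, and only normal errors are shown.

```python
import random

def one_way_f(groups):
    """Classical one-way ANOVA F statistic for a single gene.

    groups is a list of lists, one inner list of replicates per treatment.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def simulate_gene(rng, n_treatments, n_reps, is_null, effect_sd=2.0):
    """Simulate one gene with unit-variance normal errors.

    Null genes have equal treatment means. For non-null genes the treatment
    means are drawn from N(0, effect_sd^2), which is an assumed effect model.
    """
    if is_null:
        means = [0.0] * n_treatments
    else:
        means = [rng.gauss(0.0, effect_sd) for _ in range(n_treatments)]
    return [[rng.gauss(mu, 1.0) for _ in range(n_reps)] for mu in means]

if __name__ == "__main__":
    rng = random.Random(1)
    pi0 = 0.9          # assumed proportion of null genes
    n_genes = 1000     # far fewer than 22700, to keep the sketch quick
    f_stats = []
    for _ in range(n_genes):
        is_null = rng.random() < pi0
        f_stats.append(one_way_f(simulate_gene(rng, 3, 2, is_null)))
    print(max(f_stats))
```

With 2 replicates per treatment the within-group variance has very few 
degrees of freedom, which is why per-gene F statistics like these are so 
noisy.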

1. Gene-by-gene ANOVA is not as good as limma and SAM.
2. p-values are not as good as q-values.  (I used the "qvalue" package with 
limma.)
3. Two replicates do not give you a whole lot of power, even when you 
"borrow strength" by using all the genes. Most of the differentially 
expressed genes were not "discovered".
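
To make point 2 concrete, here is a rough sketch of how q-values can be 
computed from a vector of p-values. It is a simplification and not the 
"qvalue" package itself: it estimates pi-0 at a single fixed cutoff (the 
lam argument, an assumed value of 0.5), and it falls back to pi-0 = 1 when 
no p-value exceeds the cutoff. The function name qvalues is hypothetical.

```python
def qvalues(pvals, lam=0.5):
    """Return (pi0_estimate, q-values) for a list of p-values.

    pi0 is estimated as the fraction of p-values above lam, rescaled by
    1 / (1 - lam). Each q-value is the smallest estimated false discovery
    rate at which that gene would be called significant.
    """
    m = len(pvals)
    pi0 = min(1.0, sum(p > lam for p in pvals) / (m * (1.0 - lam)))
    if pi0 <= 0.0:
        pi0 = 1.0  # assumed fallback: plain Benjamini-Hochberg adjustment
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values are monotone in p.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * m * pvals[i] / rank)
        q[i] = running_min
    return pi0, q

if __name__ == "__main__":
    pi0, q = qvalues([0.01, 0.02, 0.03, 0.04, 0.6, 0.8])
    print(pi0, q)
```

A gene list chosen by q-value <= 0.05 is then expected to contain about 5% 
false discoveries, which is a different guarantee from a per-gene p-value 
cutoff.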

The SAM d-value and limma F-value had a rank correlation of 99.7% for the 
one data set where I checked this.  SAM's q-value estimate is the more 
conservative of the two, but both are somewhat conservative.  Most of the 
differences in results 
appear to be differences in the estimated q-values, which were computed 
from the p-values in limma and directly from the permutations in SAM.  I 
cannot conclude from this which method is "better" but limma certainly uses 
a lot less memory and is much more convenient if you need specific 
contrasts.  On the other hand, SAM in Excel is very easy to use and seems 
to work just fine for ANOVA-like analysis.
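
The rank correlation mentioned above can be checked for any two per-gene 
statistics with a Spearman correlation. Below is a small sketch. The 
function names are hypothetical, and the inputs would be the SAM and limma 
statistics exported from each tool, which are not reproduced here.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[pos]]):
            end += 1
        avg = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

if __name__ == "__main__":
    # Any monotone transformation leaves the rank correlation at 1.
    print(spearman([1.0, 2.0, 3.0, 4.0], [1.0, 4.0, 9.0, 16.0]))
```

A rank correlation this close to 1 means the two methods order the genes 
almost identically, so differences in the gene lists come mainly from where 
each method places its cutoff.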

Naomi S. Altman                                814-865-3791 (voice)
Associate Professor
Bioinformatics Consulting Center
Dept. of Statistics                              814-863-7114 (fax)
Penn State University                         814-865-1348 (Statistics)
University Park, PA 16802-2111