[BioC] Statistical comparison of low replicate affy data

Rafael A. Irizarry ririzarr at jhsph.edu
Wed Feb 18 15:04:17 MET 2004


On Wed, 18 Feb 2004, Matthew Hannah wrote:

> Hi,
> 
> I'm looking at how different analyses of affy data perform on a sample data 
> set of 2 conditions (untr vs. trt) with 3 biological reps (Ua, Ub, Uc vs. 
> Ta, Tb, Tc). I've computed RMA and GCRMA expression measures as standard and 
> so have 2 exprSets containing these values.
> 
> I've looked at fold changes and the treatment leads to many (1000 for RMA,
> 1600 for GCRMA) pairwise 2x changes (Ua-Ta, Ub-Tb, Uc-Tc all >2). In order to
> estimate the false positive rate I made pairwise comparisons within groups
> (Ua-Ub, Ua-Uc, Ub-Uc and the same for T) and was surprised to see that with
> only 3 reps there were very few genes that met the 2x criterion by chance
> (<5 for RMA, <10 for GCRMA). What are people's views on estimating false
> positives in this way?

This is a complicated issue.
One simple point is that if they are technical reps, you are
underestimating the FP rates you would get from biological reps.
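
As a rough illustration of the within-group versus between-group
comparison described in the quoted message, here is a minimal sketch in
base R. It assumes log2-scale values (as RMA produces), so a 2x change is
a difference of more than 1. The data are simulated, and the names
count_2x, between and within_U are hypothetical, not from any package.

```r
# Simulated log2-scale matrix standing in for the real expression values.
set.seed(1)
n_genes <- 1000
arrays <- c("Ua", "Ub", "Uc", "Ta", "Tb", "Tc")
x <- matrix(rnorm(n_genes * 6, mean = 8, sd = 0.2), nrow = n_genes,
            dimnames = list(paste0("gene", seq_len(n_genes)), arrays))

# Assumption for illustration only: the first 50 genes respond to treatment.
x[1:50, c("Ta", "Tb", "Tc")] <- x[1:50, c("Ta", "Tb", "Tc")] + 1.5

# Count genes whose log2 difference exceeds 1 (2x) in the same direction
# in every one of the listed pairwise comparisons.
count_2x <- function(x, pairs) {
  d <- sapply(pairs, function(p) x[, p[1]] - x[, p[2]])
  sum(apply(d > 1, 1, all) | apply(d < -1, 1, all))
}

between  <- list(c("Ta", "Ua"), c("Tb", "Ub"), c("Tc", "Uc"))
within_U <- list(c("Ua", "Ub"), c("Ua", "Uc"), c("Ub", "Uc"))

n_between <- count_2x(x, between)   # treated vs. untreated pairs
n_within  <- count_2x(x, within_U)  # replicate vs. replicate pairs
cat("between-group:", n_between, " within-group:", n_within, "\n")
```

Note that the three within-group comparisons share arrays and so are not
independent, which is one more reason to treat the within-group count as
a rough guide rather than a false positive rate.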

> 
> I now want to make some statistical comparisons of the data, both paired and
> unpaired. I was thinking of making these comparisons first Ua-Ta.. (paired)
> and then Uabc-Tabc (unpaired), and then permuting the data so as to compare
> Ua,Tb,Uc - Ta,Ub,Tc....etc in various combinations, paired and unpaired.
> Would this provide reliable false positive rates?
> 
> I have looked into the BioC packages and guess I'll use a t-test with multtest 
> correction, LPE (although this doesn't say it accepts RMA-type data?) and SAM, 
> EBAM & EBAM.WILC from siggenes. Are there others I should also consider?
> 
> My request for help is this: if people have experience of applying these tests
> to affy data, specifically in the form of the RMA-style exprSets my data is
> currently in, could they possibly post or send the R scripts they used? I've
> searched BioC and the help pages, but my attempts so far have returned errors
> and I can't help but think I'm missing something obvious (need to get rid of
> the affy IDs?), and any help would speed things up a great deal.
> 
> For example
> > cl <- c(0,0,0,1,1,1)
> > rmasam <- sam(esetrma, cl)
> returned - Error in var(v) : missing observations in cov/cor
> 
> > rmaebw <- ebam.wilc(esetrma, cl)
> returned - Error in 2^data : non-numeric argument to binary operator
> 
> Obviously if anyone is interested in what results I (eventually) obtain then 
> let me know.
> 
> Thanks,
> 
> Matt
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
>
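
The two errors quoted above look consistent with the functions receiving
the exprSet object itself rather than a plain numeric matrix, though that
is a guess from the messages alone. The sketch below shows the general
pattern on a simulated matrix using only base R: row-wise unpaired and
paired t-tests followed by a multiple-testing adjustment. With real data
the matrix would presumably come from exprs(esetrma) (the Biobase
accessor). The row names carry the probe IDs, so they do not need to be
removed. The simulated values and the BH adjustment are illustrative
choices, not a recommendation from this thread.

```r
# With real data (assumption: esetrma is an exprSet): x <- exprs(esetrma)
# Here a simulated log2-scale matrix is used so the sketch runs on its own.
set.seed(2)
x <- matrix(rnorm(200 * 6, mean = 8, sd = 0.3), nrow = 200,
            dimnames = list(paste0("probe", 1:200),
                            c("Ua", "Ub", "Uc", "Ta", "Tb", "Tc")))
x[1:20, 4:6] <- x[1:20, 4:6] + 2   # illustrative treatment effect

cl <- c(0, 0, 0, 1, 1, 1)          # class labels, one per column

# Basic checks: a numeric matrix with one label per array.
stopifnot(is.matrix(x), is.numeric(x), length(cl) == ncol(x))

# Unpaired (Welch) t-test per row.
p_unpaired <- apply(x, 1, function(v)
  t.test(v[cl == 1], v[cl == 0])$p.value)

# Paired t-test per row; assumes columns are ordered a, b, c in both groups
# so that Ta pairs with Ua, Tb with Ub, and Tc with Uc.
p_paired <- apply(x, 1, function(v)
  t.test(v[cl == 1], v[cl == 0], paired = TRUE)$p.value)

# Benjamini-Hochberg adjustment of the unpaired p-values.
adj_unpaired <- p.adjust(p_unpaired, method = "BH")
print(head(sort(adj_unpaired)))
```

With only three pairs the paired test has 2 degrees of freedom, so its
p-values will be coarse whatever adjustment is applied afterwards.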


