[BioC] Limma p-value distributions, false +ve/-ve etc...
Matthew Hannah
Hannah at mpimp-golm.mpg.de
Fri Oct 1 15:38:17 CEST 2004
Hi,
I guess the simple question is: would you expect, or have you seen, a
'standard' distribution of p-values for a treated-untreated comparison
(3 reps) after the eBayes procedure in Limma?
Expressed in my usual 'comprehensive' style ;-)
Following on from a previous thread I've started to look more into the
p-value distributions to get an idea of false +ve and -ve rates. As I
understand it, p-values from genes with no true change should be
uniformly distributed, so the histogram should flatten out as p
approaches 1. The following paper uses a mixture of beta and uniform
distributions to model false/true +ve and -ve rates.
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=12835267&dopt=Abstract
Code that can be pasted into R is found here -
http://www.stjuderesearch.org/depts/biostats/BUM/index.html
Has anyone done something similar? And is this approach valid when
applied to the eBayes moderated stats of limma?
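
For anyone wanting to try the mixture idea outside the linked code, here
is a minimal Python sketch of a beta-uniform mixture fit. It assumes the
density f(p) = lambda + (1 - lambda) * a * p^(a - 1) with 0 < a < 1, and
it fits lambda and a with a simple EM iteration, which is not necessarily
how the linked R code does it, so treat it as an illustration only. The
function names and the simulated parameter values are made up for the
example; real input would be the vector of p-values exported from limma.

```python
import math
import random


def bum_density(p, lam, a):
    """Mixture density: lam * Uniform(0, 1) + (1 - lam) * Beta(a, 1)."""
    return lam + (1.0 - lam) * a * p ** (a - 1.0)


def bum_loglik(pvals, lam, a):
    """Log-likelihood of the mixture for a list of p-values."""
    return sum(math.log(bum_density(max(p, 1e-12), lam, a)) for p in pvals)


def fit_bum(pvals, lam=0.5, a=0.5, max_iter=300, tol=1e-8):
    """Estimate (lam, a) by EM; the starting values are arbitrary guesses."""
    ps = [min(max(p, 1e-12), 1.0) for p in pvals]  # avoid log(0)
    logs = [math.log(p) for p in ps]
    for _ in range(max_iter):
        # E-step: probability that each p-value came from the beta component.
        w = [(1.0 - lam) * a * p ** (a - 1.0) / bum_density(p, lam, a)
             for p in ps]
        sw = sum(w)
        swl = sum(wi * li for wi, li in zip(w, logs))
        if sw == 0.0 or swl == 0.0:
            break  # degenerate case: no usable weight on the beta component
        # M-step: closed-form updates; a is clamped to stay inside (0, 1).
        new_lam = 1.0 - sw / len(ps)
        new_a = min(max(-sw / swl, 1e-3), 0.999)
        converged = abs(new_lam - lam) < tol and abs(new_a - a) < tol
        lam, a = new_lam, new_a
        if converged:
            break
    return lam, a


def null_fraction_upper_bound(lam, a):
    """Height of the fitted density at p = 1."""
    return lam + (1.0 - lam) * a


def simulate_bum(n, lam, a, seed=0):
    """Draw n p-values from the mixture (inverse-CDF draw for Beta(a, 1))."""
    rng = random.Random(seed)
    return [rng.random() if rng.random() < lam else rng.random() ** (1.0 / a)
            for _ in range(n)]


if __name__ == "__main__":
    # Made-up parameters, used only to exercise the fit.
    pvals = simulate_bum(3000, lam=0.7, a=0.2, seed=1)
    lam_hat, a_hat = fit_bum(pvals)
    print("lambda:", round(lam_hat, 3), "a:", round(a_hat, 3))
    print("density at p = 1:",
          round(null_fraction_upper_bound(lam_hat, a_hat), 3))
```

The value lambda + (1 - lambda) * a is the height of the fitted density
at p = 1, i.e. the largest flat (null-like) component that fits under the
curve, so it can be read as an upper bound on the fraction of unchanged
genes under this model.
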
This approach 'appears' ok if you have a lot of replicates (my 9
genotypes x 2 treatments x 3 reps example again), i.e. the p-values show
the expected distribution. However, if you drop down to a single
genotype, and therefore a 3 x 3 comparison, the p-values aren't well
distributed (slightly more in 0.55-0.8, slightly fewer in 0.85-1). I'm
worried this means that the test's assumptions aren't met, but is there
a way of formally testing this? At the same time I suppose it's not too
surprising that with low replication the distribution is not ideal
(hence my question about other people's experience with p-value
distributions from limma). Incidentally, limma p-values have a better
distribution than those from a paired or equal-variance t-test, which I
guess is a good sign.
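
One rough way to put a number on 'not well distributed' is to test the
upper part of the histogram for flatness. The Python sketch below takes
the p-values above a cutoff, rescales them to the unit interval and
computes a Kolmogorov-Smirnov statistic against the uniform
distribution, with an asymptotic p-value. It assumes that the region
above the cutoff is dominated by unchanged genes and that genes are
independent; neither holds exactly for array data, so the result is a
guide rather than a formal verdict. The cutoff of 0.5 and the function
names are arbitrary choices for the example.

```python
import math
import random


def kolmogorov_sf(x):
    """Asymptotic P(sqrt(n) * D > x) under the null (Kolmogorov series)."""
    if x < 0.2:
        return 1.0  # the series is numerically 1 in this range
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * x * x)
                for k in range(1, 101))
    return min(max(2.0 * total, 0.0), 1.0)


def upper_tail_ks(pvals, cutoff=0.5):
    """KS check of uniformity for the p-values above `cutoff`.

    Returns (D, approximate p-value, number of p-values used).
    """
    # Rescale the tail to (0, 1]; a flat tail stays uniform after rescaling.
    xs = sorted((p - cutoff) / (1.0 - cutoff) for p in pvals if p > cutoff)
    n = len(xs)
    if n == 0:
        raise ValueError("no p-values above the cutoff")
    # D is the largest gap between the empirical CDF and the uniform CDF.
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))
    return d, kolmogorov_sf(math.sqrt(n) * d), n


if __name__ == "__main__":
    rng = random.Random(2)
    flat = [rng.random() for _ in range(4000)]             # flat histogram
    dipped = [rng.betavariate(2, 2) for _ in range(4000)]  # deficit near 1
    print("flat:  ", upper_tail_ks(flat))
    print("dipped:", upper_tail_ks(dipped))
```

A small D with a large p-value is consistent with a flat tail; a deficit
near 1, like the one described above, shows up as a larger D.
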
I had a quick look at someone else's data for a 3 x 3 comparison, and
they had an even 'worse' p-value distribution: theirs had a large
secondary peak from 0.8-1. Again I assume this could mean assumptions
were not met, but can anyone explain any possible causes? The reasons I
can think of are that their experiment had more similar biological
replicates and that their treatment could cause a large number of
co-regulated changes. However, not being a statistician, I can't relate
this to the p-value distribution.
Thanks in advance for any feedback.
Matt