[BioC] Selecting p-value cutoffs for differential expression

Matthew Hannah Hannah at mpimp-golm.mpg.de
Tue Sep 7 17:20:22 CEST 2004


Hi,

Before the main question, a minor point from the limma guide (as I'm
using it to compute the p-values). In the swirl example, the following
sentence appears after the toptable is produced. Are the statistics not
independent because there are duplicate spots, or is there another
reason that I should be aware of?

"Beware that the Benjamini and Hochberg method used to control the false
discovery rate assumes independent statistics which we do not have here
(see help(p.adjust))."
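For anyone who wants to see what that adjustment does to a list of
p-values, here is a minimal sketch of the Benjamini and Hochberg step-up
calculation in plain Python. It is meant to mirror what R's p.adjust
offers as method "BH" (alias "fdr"), but the function name bh_adjust and
the example p-values are made up for illustration, and it says nothing
about whether the independence assumption holds for a given data set.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values for a list of raw p-values.

    Illustrative helper (hypothetical name), not taken from limma or R.
    """
    m = len(pvalues)
    # Indices of the p-values, ordered from largest to smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for rank_from_top, idx in enumerate(order):
        rank = m - rank_from_top          # rank in ascending order (1..m)
        candidate = pvalues[idx] * m / rank
        # Enforce monotonicity: an adjusted value never exceeds the
        # adjusted value of any larger raw p-value.
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values, purely for illustration.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
adj = bh_adjust(pvals)
for p, q in zip(pvals, adj):
    print(f"raw p = {p:.3f}   BH-adjusted = {q:.4f}")
```

Calling genes with an adjusted value below 0.05 would, under the
method's assumptions, aim for an expected false discovery proportion of
at most 5% among the genes called.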

Anyway, that aside, I'm looking to canvass opinion on how to select a
p-value cutoff for genes that are differentially expressed, hopefully
also allowing an assessment of false positive and false negative rates.
I've been playing around with the following, but none seems
satisfactory. Does anyone have any input or experience on this topic?

1. Look at p-values for genes that are not called present in any of the
arrays. I suspect some are slipping through, as there is still a peak of
low p-values.

2. Look at p-values for genes that have not been previously reported as
regulated by the treatment. However, most previous work is poorly
replicated and uses arbitrary cutoffs such as 2-fold, so there is a big
peak of low p-values, though not as big as for those that have been
previously reported. Any ideas how to use this difference?

3. Use a set of control or house-keeping genes to define a lower
cut-off. Unfortunately some do respond to the treatment (also confirmed
in previous work), so how to select appropriate genes?

4. As it seems that gcrma values have a bimodal distribution, any ideas
on how to utilise the lower peak (which presumably represents 'absent'
genes) to calculate a threshold?

5. Choose an FDR-adjusted p-value cutoff of 0.01, 0.001 or 0.0001,
assuming these give approximately the corresponding false positive
rates?

6. 'Decide' how many genes you want to be differentially expressed, and
then select one of the above criteria appropriately. This obviously
works as you'd like ;-) but is tricky to justify!
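To make options 1 and 3 a bit more concrete, here is a rough sketch in
plain Python of how a set of presumed-unchanged genes might be used to
estimate the false positive proportion at a given cutoff. Everything in
it is an assumption for illustration: the function name empirical_fdr,
the made-up p-values, and above all the idea that the reference set
really is unaffected by the treatment (which is exactly what is in doubt
above). It also treats every tested gene as potentially null when
scaling up, so the estimate should be on the conservative side.

```python
def empirical_fdr(all_p, null_p, cutoff):
    """Estimate the proportion of false positives among genes called at
    `cutoff`, using p-values from a presumed-null reference set.

    Illustrative helper (hypothetical name); assumes `null_p` is
    non-empty and representative of genes that truly do not respond.
    """
    called = sum(p < cutoff for p in all_p)
    if called == 0:
        return 0.0
    # Fraction of the presumed-null genes that pass the cutoff anyway.
    null_rate = sum(p < cutoff for p in null_p) / len(null_p)
    # Conservative scaling: pretend every tested gene could be null.
    expected_false = null_rate * len(all_p)
    return min(1.0, expected_false / called)


# Made-up p-values, purely for illustration.
all_genes = [0.0001, 0.0005, 0.002, 0.004, 0.03, 0.2, 0.4, 0.6, 0.8, 0.9]
absent_genes = [0.003, 0.12, 0.25, 0.33, 0.47, 0.58, 0.66, 0.71, 0.85, 0.97]

for cutoff in (0.01, 0.001):
    est = empirical_fdr(all_genes, absent_genes, cutoff)
    print(f"cutoff {cutoff}: estimated false positive proportion {est:.2f}")
```

Scanning a range of cutoffs this way would give a curve to pick a
threshold from, though the result is only as good as the choice of
reference genes.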

Cheers,
Matt


