[BioC] F-statistic in limma
echang4 at life.uiuc.edu
Fri Mar 3 23:20:35 CET 2006
Thank you very much for the many tips! I really do appreciate it. I do
have another question about p-values associated with F-statistics.
In the limma user guide, it says the following about the F-statistic (Sec 18):
"... In a complex experiment with many contrasts, it may be desirable to
select genes firstly on the basis of their moderated F-statistics, and
subsequently to decide which of the individual contrasts are significant
for the selected genes. This cuts down on the number of tests which
need to be conducted and therefore on the amount of adjustment for
multiple testing. The function decideTests() with method="nestedF" is
able to conduct such tests."
If I do
results<- decideTests(fit.all, method="nestedF", adj="none")
I take this to mean that I can select the genes with either +1 or -1 in
results$Res.contrastcoefficient and then apply a multiple-testing correction
(using, say, q-value?) to the unadjusted p-values for my contrast of interest
(for which I would then choose some cutoff). I am wondering whether applying
the multiple-testing correction in this fashion would underestimate the
true FDR, compared with running q-value on the entire set of genes
regardless of how they contribute to the F-statistic.
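To make the two workflows concrete, here is a minimal base-R sketch. Everything in it is an assumption for illustration: the p-values and the +1/-1/0 classification are made-up numbers rather than limma output, and Benjamini-Hochberg via p.adjust() stands in for q-value.

```r
# Hypothetical unadjusted p-values for one contrast across 10 genes.
p <- c(0.001, 0.004, 0.012, 0.030, 0.200, 0.350, 0.500, 0.650, 0.800, 0.950)

# Hypothetical nestedF classification for the same contrast (-1, 0 or +1).
res <- c(1, -1, 1, -1, 0, 0, 0, 0, 0, 0)

# Genes flagged as significant for this contrast.
selected <- res != 0

# Workflow A: adjust only the p-values of the pre-selected genes.
adj.subset <- p.adjust(p[selected], method = "BH")

# Workflow B: adjust over all genes, then look at the same genes.
adj.full <- p.adjust(p, method = "BH")[selected]

# Side-by-side comparison for the selected genes.
print(cbind(unadjusted = p[selected], subset = adj.subset, full = adj.full))
```

In this toy example the subset-adjusted values are smaller than the full-set values for every selected gene, because workflow A ignores the selection step. That is exactly the situation I am worried about, although the sketch by itself does not show what happens to the true FDR.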
Thank you for your time,
James W. MacDonald wrote:
> echang4 at life.uiuc.edu wrote:
>> Hi Bioconductor users,
>> I am having trouble understanding how multiple-testing adjustment is
>> done in limma (specifically in decideTests). I am really confused about the
>> difference in meaning between the moderated F p-value and the adjusted p-value.
>> If I try two different "flavors" of decideTests (e.g. nestedF and
>> global), I can see that the results are different, but is it the p-value
>> that is adjusted or the F-statistic that is adjusted?
> The statistics themselves are never adjusted in decideTests(), only
> the p-values.
> The difference is in how the p-values are adjusted. For the 'global'
> option, all the contrasts are considered to be independent, and the
> p-values are adjusted as if you just had a bunch of independent t-tests.
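As a way of checking my understanding of this, here is a minimal base-R sketch that pools every test into one family before adjusting, next to a per-contrast adjustment for comparison. This is not limma's code: the p-value matrix is made up, and Benjamini-Hochberg via p.adjust() is an assumed choice of adjustment.

```r
# Hypothetical unadjusted p-values: 3 genes (rows) x 2 contrasts (columns).
p <- matrix(c(0.001, 0.020, 0.400,
              0.010, 0.030, 0.900),
            nrow = 3,
            dimnames = list(paste0("gene", 1:3), c("c1", "c2")))

# Pooled adjustment: all 6 tests are treated as a single family.
p.pooled <- p
p.pooled[] <- p.adjust(as.vector(p), method = "BH")

# Column-wise adjustment: each contrast is its own family of 3 tests.
p.bycol <- apply(p, 2, p.adjust, method = "BH")

print(p.pooled)
print(p.bycol)
```

The statistics themselves are untouched in both cases; only the family of p-values passed to the adjustment changes.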
> The nestedF option is a bit more complicated. First, a bit of
> background. The F-statistic is used to determine if there are any
> differences between the samples, but it doesn't tell you which
> sample(s) are different. You have to fit contrasts to find out which
> sample(s) are different.
> So the idea with the nestedF is to adjust the p-values associated with
> the F-test to find which genes are differentially expressed in at
> least one sample. Now we have a list of genes that are differentially
> expressed, but we don't know for which sample(s) that may be true. The
> t-statistics associated with the contrasts are then inspected and the
> largest one (in absolute value) is considered significant. Now, there
> may be other contrasts that are significant as well, so the largest
> t-statistic is set to the same absolute value as the second largest
> t-statistic, and the F-statistic is calculated again. If the
> F-statistic is still significant, the second largest contrast is
> considered significant. This procedure is continued until the
> F-statistic is no longer significant.
> The basic reasoning here is that the largest t-statistic for a set of
> contrasts is significant if the overall F-statistic is significant. By
> following this step-wise procedure, we can determine which contrasts
> are contributing to the overall significance of the F-statistic.
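The step-wise procedure described above can be sketched for a single gene as follows. This is my simplified reading, not limma's implementation. It assumes the contrasts are uncorrelated, so that the F-statistic is just the mean of the squared t-statistics, and it compares that against an unadjusted F critical value. The function name nested_f_sketch and its arguments are hypothetical.

```r
# Step-down classification of contrasts for one gene.
# tstat:   t-statistics for the contrasts (hypothetical input)
# df:      residual degrees of freedom for the F-test
# p.value: significance cutoff for the F-test
nested_f_sketch <- function(tstat, df, p.value = 0.05) {
  n <- length(tstat)
  # Critical value of the F distribution with n and df degrees of freedom.
  crit <- qf(p.value, df1 = n, df2 = df, lower.tail = FALSE)
  result <- integer(n)
  # Contrasts ordered from largest to smallest absolute t-statistic.
  ord <- order(abs(tstat), decreasing = TRUE)
  x <- tstat
  for (i in seq_len(n)) {
    if (i > 1) {
      # Shrink the contrasts already examined to the size of the
      # current one, keeping their signs.
      bigger <- ord[seq_len(i - 1)]
      x[bigger] <- sign(x[bigger]) * abs(x[ord[i]])
    }
    # Stop as soon as the recomputed F-statistic is no longer significant.
    if (mean(x^2) <= crit) break
    result[ord[i]] <- as.integer(sign(x[ord[i]]))
  }
  result
}

# Example: three contrasts with 20 residual degrees of freedom.
calls <- nested_f_sketch(c(5, -3, 0.5), df = 20)
print(calls)
```

In the example, the first two contrasts are flagged (+1 and -1) and the third is not, because once the two larger t-statistics are shrunk to 0.5 in absolute value the recomputed F-statistic falls below the critical value.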