[BioC] Limma - decideTests - separate/nestedF questions

Sun Apr 8 01:55:33 CEST 2007

Dear Misha,

> Date: Fri, 6 Apr 2007 13:00:00 +0100
> From: "Misha Kapushesky" <ostolop at ebi.ac.uk>
> Subject: [BioC] Limma - decideTests - separate/nestedF questions
> To: bioconductor at stat.math.ethz.ch
>
> Hi all,
>
> Several questions concerning intricacies of limma's decideTests() have
> emerged from a discussion with some colleagues here at the EBI. Perhaps
> someone can enlighten us.
>
> 1. The docs (and previously on this list) say that nestedF is especially
> powerful in identifying genes that are diff. expressed in many contrasts,
> and less powerful for ones diff. expressed in only one contrast.

I have said this on the list, but I don't think the limma documentation says so.

> On the
> other hand we also read that with nestedF "at least one contrast will be
> classified as significant if and only if the overall F-statistic is
> significant" - meaning that it should pick up genes diff. expressed in only
> one contrast, shouldn't it? Why less powerful?

You can verify for yourself in ordinary ANOVA that F-tests are less powerful than t-tests for
detecting sparse effects.  This is because the null effects tend to dilute the truly different
effects.

Suppose for example that you're testing 10 contrasts with a common residual standard deviation and
a significance level of p=0.001.  For simplicity, suppose the residual degrees of freedom is large,
so I can use normal calculations instead of the t distribution.  Suppose that only one contrast is
truly different.  For all the other contrasts, the null hypothesis of no difference is true.

If you do individual t-tests, you need a t-statistic of 3.9 to be significant after Bonferroni
adjustment for multiple testing.

Now consider the F-test.  The typical size of a t-statistic for a null contrast is t^2=1, so the
F-statistic will typically be about F=(t^2+9)/10 where t is the t-statistic for the truly
different contrast.  To be significant at 0.001 you need F=2.96, i.e., t^2=10*2.96-9=20.6, which
implies t=4.54.  In other words, the t-statistic needs to be larger to stand out as significant in
an F-statistic than it does as an individual test.
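
The arithmetic above can be checked with a short script.  This is only a sketch in Python (not
limma code), under the same large-degrees-of-freedom assumption, so that 10*F is treated as
chi-square on 10 df.  The helper names chi2_sf_even_df and chi2_upper_quantile are made up for
this illustration.

```python
import math
from statistics import NormalDist

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable; closed form valid for even df only."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def chi2_upper_quantile(p, df, lo=0.0, hi=200.0):
    """Find x with chi2_sf_even_df(x, df) = p by bisection."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_even_df(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

n_contrasts = 10
alpha = 0.001

# Individual two-sided tests, Bonferroni-adjusted for 10 contrasts.
t_bonferroni = NormalDist().inv_cdf(1 - alpha / n_contrasts / 2)

# Large residual df assumed: n_contrasts * F behaves like chi-square(n_contrasts).
f_critical = chi2_upper_quantile(alpha, n_contrasts) / n_contrasts

# One real effect plus nine null contrasts, each contributing about t^2 = 1.
t_needed_in_f = math.sqrt(n_contrasts * f_critical - (n_contrasts - 1))

print(round(t_bonferroni, 2), round(f_critical, 2), round(t_needed_in_f, 2))
```

The three printed numbers correspond to the 3.9, 2.96 and 4.54 quoted above.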

On the other hand, if several contrasts were truly different, then the F-test would be more
powerful than the t-tests.
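
To see roughly where the balance tips, here is a sketch extending the same back-of-envelope
calculation.  It makes an extra assumption that is not part of the example above: k of the 10
contrasts are truly different and all share the same t-statistic, while each remaining null
contrast still contributes about t^2=1.  The function name t_needed is invented for the
illustration.

```python
import math
from statistics import NormalDist

# Upper 0.001 quantile of chi-square on 10 df (10 * the F threshold of 2.96).
CHI2_CRIT = 29.588

n_contrasts = 10
alpha = 0.001

# Threshold for individual two-sided tests with Bonferroni adjustment.
t_bonferroni = NormalDist().inv_cdf(1 - alpha / n_contrasts / 2)

def t_needed(k):
    """|t| that each of k equally strong contrasts needs for the F-test to be significant.

    Assumes F is about (k * t^2 + (n_contrasts - k)) / n_contrasts.
    """
    return math.sqrt((CHI2_CRIT - (n_contrasts - k)) / k)

for k in (1, 2, 3, 5, 10):
    verdict = "F-test easier" if t_needed(k) < t_bonferroni else "t-test easier"
    print(k, round(t_needed(k), 2), round(t_bonferroni, 2), verdict)
```

Under these assumptions the t-statistic required by the F-test drops below the Bonferroni
threshold as soon as two contrasts are truly different, and keeps dropping as k grows.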

> 2. Say we have 5 contrasts and we run decideTests() both with "separate" and
> with "nestedF" methods and are comparing the results. Suppose some gene is
> marked as differentially expressed in 2 out of 5 "separate" contrasts, but
> is significant in only 1 out of 5 "nestedF" ones. What's the best way to
> interpret such a result? Shall we say this gene is not sufficiently variable
> overall? And vice versa, if it's marked significant in 2 columns in
> "nestedF" but only in 1 in "separate" results, does "nestedF" overestimate
> its significance, or is it that "separate" failed to pick it up in some
> contrast?

You're starting from the assumption that one of the methods is correct and the other is wrong, and
that there is a way to figure out which one is correct in each case.  This is not the right way to
think about it.  It is perfectly possible for the two methods to give different results and for
both to be correct.  (Otherwise limma wouldn't offer more than one method.)  It is not even
possible to say that one method is consistently more stringent than the other.  If that were so, it
would be possible to consolidate the two methods into one.

method="separate" and method="nestedF" do quite different things.  "separate" controls the FDR on
a per-contrast basis only.  It does not control the FDR globally across all contrasts.  "nestedF"
controls the FDR on a per-gene basis only.  It does not offer any formal FDR control at the
contrast-level.

In practice you will find that "separate" gives more significant results when there are other
significant results for the same contrast, i.e., significant results beget other results down the
same contrast.  Hence you will find that the t-statistic threshold for significance varies between
contrasts.  On the other hand, "nestedF" will give more significant results for genes for which
there are other significant results, i.e., significant results beget other results for the same
gene.  You will find that the t-statistic threshold for significance is much lower for genes with
many significant contrasts.

The bottom line is that you should not attempt to mix-n-match the different methods.  You should
decide in advance what sort of errors you want to control, choose the appropriate multiple testing
method, and stick to it.  I personally use nestedF when I'm most interested in finding genes which
respond to more than one treatment.  Otherwise I would use other methods.

I wish I could make this simpler, but multiple testing in two dimensions (genes and contrasts) is
intrinsically subtle.

> 3. Does it make sense to rank genes in order of significance of differential
> expression by looking at how many columns of "nestedF" results have non-zero
> values for each gene?

No.  The F-statistic orders the genes in terms of significance.

> If a gene is classified by "nestedF" as diff.
> expressed in only 1 contrast, is it still one of the most variable genes
> across all contrasts?

You need to define what you mean by "variable".

> 4. Say we have an experiment with treatment A, treatment B and a compound
> treatment A+B (not time course), is it legitimate to apply "nestedF" to all
> pair-wise contrasts to identify the most responsive overall genes, but then
> to look at the results matrix separately to say which treatments contributed
> most to this overall significance? Or is it more sensible to do it with the
> method "separate"?

Both are legitimate methods, and both could be sensible in your case depending on what you know
about your treatments and what sort of effects you're most interested in finding.

To throw in another consideration, why not use method="global"?  It is by far the simplest
method, using the same threshold for all genes and all contrasts, and it provides global FDR
control in most cases.
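
To make the contrast between per-contrast and global adjustment concrete, here is a toy sketch in
Python.  It is not limma's implementation: it assumes Benjamini-Hochberg adjustment with a 0.05
cutoff, applied column by column for "separate" and to the whole matrix at once for "global".  The
p-values and the function names bh_adjust, decide_separate and decide_global are invented for the
illustration.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def decide_separate(p, cutoff=0.05):
    """Adjust each contrast (column) on its own, in the spirit of method="separate"."""
    n_genes, n_contrasts = len(p), len(p[0])
    calls = [[False] * n_contrasts for _ in range(n_genes)]
    for j in range(n_contrasts):
        adj = bh_adjust([p[g][j] for g in range(n_genes)])
        for g in range(n_genes):
            calls[g][j] = adj[g] <= cutoff
    return calls

def decide_global(p, cutoff=0.05):
    """Adjust all genes and contrasts as one vector, in the spirit of method="global"."""
    n_genes, n_contrasts = len(p), len(p[0])
    flat = [p[g][j] for g in range(n_genes) for j in range(n_contrasts)]
    adj = bh_adjust(flat)
    return [[adj[g * n_contrasts + j] <= cutoff for j in range(n_contrasts)]
            for g in range(n_genes)]

# Toy p-values (made up): rows are genes, columns are two contrasts.
p = [
    [0.001, 0.030],
    [0.002, 0.600],
    [0.003, 0.700],
    [0.020, 0.800],
]

sep = decide_separate(p)
glob = decide_global(p)
for g in range(len(p)):
    print(p[g], "separate:", sep[g], "global:", glob[g])
```

In this toy matrix the first contrast has many small p-values, so per-contrast adjustment accepts
p=0.020 there but rejects p=0.030 in the second contrast, i.e., the effective threshold differs
between contrasts.  The global adjustment applies one threshold to everything, and here both of
those p-values pass it.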

Best wishes
Gordon

> Many thanks in advance for any answers to these questions!
>
> --Misha K. and colleagues
> Microarray Informatics Team, EBI