[BioC] Limma B-statistics

Sun Oct 24 00:36:57 CEST 2004

> Date: Fri, 22 Oct 2004 11:06:15 +0100
> From: Brian Lane <bsl8096 at liverpool.ac.uk>
> Subject: [BioC] Limma B-statistics
> To: bioconductor at stat.math.ethz.ch
> Message-ID: <E12C817EC5C08FD1B9BF7823 at 182105-93607r.liv.ac.uk>
> Content-Type: text/plain; charset=us-ascii; format=flowed
>
> Hi,
> I need some help with the interpretation of B statistics generated by
> eBayes in the limma package.
>
> I want to compare gene expression in three groups of Affy samples. The
> probe level data was generated from .cel files (ReadAffy()), an exprSet
> object was generated using mas5 (scaled to 100) and a linear model fitted
> to the data using a design based on the three groups (6, 5, and 5 samples
> in each group, respectively). I have then made 3 contrasts to cover all
> possible comparisons within the data set, and generated empirical Bayes
> statistics using eBayes. I've then used classifyTestsF to classify each
> gene according to the contrasts.
>
> The results of all this are 23 significantly differentially expressed
> genes. The moderated t-values for all these 23 genes have p<0.01. However,
> all the B-values are <0 (average -3!). In fact, a volcano plot of log-odds
> and fold-change in the three contrasts show that all the B-values are
> negative.
>
> My understanding is that B<0 implies the gene is more likely to not be
> differentially expressed than to be differentially expressed. If this is
> the case, should I take the "significant genes" seriously? If not, is there
> any reason why the B-values should all be negative or does this simply
> reflect the fact that there is little evidence of differential expression
> in the data set as a whole?

Yes, this is supposed to indicate little evidence of differential expression.
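
As a rough illustration of the scale: B is a log-odds, so it can be converted to a posterior
probability with the logistic function. The sketch below uses only base R. The B-values in it are
made up for illustration, apart from -3, which is the average you quote. Keep in mind that B also
depends on the prior proportion of differentially expressed genes assumed by eBayes().

```r
# Convert log-odds (B) to a posterior probability of differential expression.
# The values are illustrative; -3 is the average B reported in the question.
B <- c(-3, 0, 1.5)

# plogis(B) is the logistic function exp(B) / (1 + exp(B))
prob <- plogis(B)

# B = 0 corresponds to even odds (probability 0.5);
# B = -3 corresponds to a probability of roughly 0.047
print(round(prob, 3))
```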

I think the problem is likely to be that you have used classifyTestsF() without any adjustment for
multiple testing.  Please note that classifyTestsF() does not adjust for multiple testing across
probe sets.  You are supposed to choose a p-value cutoff yourself that reflects the number of
probe sets (much lower than 0.01!) and pass it to classifyTestsF().  See the Ecoli case study in
the User's Guide for an example.
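
One simple way to arrive at such a cutoff, sketched here in base R, is a Bonferroni-style division
of the nominal level by the number of probe sets. The number of probe sets and the raw p-values
below are hypothetical, and this is only one conventional choice, not necessarily the one used in
the User's Guide.

```r
# Hypothetical number of probe sets on the array (an assumption, not from the post)
n_probes <- 10000
alpha <- 0.01

# Bonferroni-style per-probe-set cutoff that reflects the number of tests
cutoff <- alpha / n_probes

# Equivalent view: adjust the raw p-values and compare them with alpha.
# The raw p-values here are made up for illustration.
p_raw <- c(1e-8, 5e-7, 2e-4, 0.009)
p_bonf <- p.adjust(p_raw, method = "bonferroni", n = n_probes)

# Both comparisons select the same probe sets
print(p_raw < cutoff)
print(p_bonf < alpha)
```

With a cutoff of this kind, a raw p-value just under 0.01 no longer counts as significant.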

I have found that this aspect of classifyTestsF() is often misunderstood, so I recommend
that you switch to decideTests() in the newer version of limma instead of classifyTestsF().
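
A minimal sketch of what that might look like follows. It runs on simulated data, since I do not
have your exprSet: the expression matrix, the group labels A, B and C, and the 0.05 cutoff are all
assumptions made for illustration. Only the group sizes (6, 5 and 5) come from your description.

```r
library(limma)

set.seed(1)
# Simulated log-expression matrix: 1000 hypothetical probe sets, 16 arrays
expr <- matrix(rnorm(1000 * 16), nrow = 1000,
               dimnames = list(paste0("probe", 1:1000), NULL))

# Three groups of 6, 5 and 5 samples; the labels are placeholders
group <- factor(rep(c("A", "B", "C"), times = c(6, 5, 5)))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Fit the linear model and form all three pairwise contrasts
fit <- lmFit(expr, design)
cont.matrix <- makeContrasts(B - A, C - A, C - B, levels = design)
fit2 <- eBayes(contrasts.fit(fit, cont.matrix))

# Classify each probe set, adjusting for multiple testing across probe sets
results <- decideTests(fit2, method = "nestedF",
                       adjust.method = "BH", p.value = 0.05)
print(summary(results))
```

Because the simulated data contain no real group differences, few if any probe sets should be
called significant here once the adjustment is applied.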

You might find the section "Statistics for Differential Expression" in the User's Guide helpful.

Gordon

> Regards,
> Brian Lane
> Dept of Haematology
> Liverpool University