[BioC] Limma: correct calculation of B statistics (log odds)

Sat Apr 22 07:04:01 CEST 2006

At 12:23 AM 22/04/2006, J.delasHeras at ed.ac.uk wrote:
>But you can't use the B statistic to decide a suitable cut off for 
>your experiment unless the proportion of DE genes has been estimated.

Or if you think the preset proportion is a priori reasonable.
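
As a rough sketch of why the proportion matters for cut-offs: B is a log-odds of differential expression, so B = 0 corresponds to a 50-50 chance, and a different assumed proportion moves the whole scale. The Python below is only an illustration, not limma code. The function names are made up for this example, the proportions 0.01 and 0.10 are arbitrary illustrative values, and it assumes the proportion enters B only through the prior log-odds term with everything else held fixed (which need not hold exactly if other hyperparameters are re-estimated).

```python
import math


def prior_log_odds(proportion):
    """Log-odds implied by an assumed proportion of DE genes."""
    return math.log(proportion / (1.0 - proportion))


def posterior_probability(b):
    """Convert a log-odds value B into a probability of DE."""
    return 1.0 / (1.0 + math.exp(-b))


def rescale_b(b, old_proportion, new_proportion):
    """Shift B from one assumed proportion to another.

    Assumption: the proportion contributes only an additive
    prior log-odds term, so swapping it is a constant shift.
    """
    return b - prior_log_odds(old_proportion) + prior_log_odds(new_proportion)


if __name__ == "__main__":
    # Hypothetical B values, not taken from any real experiment.
    for b in [-2.0, 0.0, 3.0]:
        b_new = rescale_b(b, old_proportion=0.01, new_proportion=0.10)
        print(
            f"B={b:+.2f} (p={posterior_probability(b):.3f})  ->  "
            f"B={b_new:+.2f} (p={posterior_probability(b_new):.3f})"
        )
```

Under that assumption every gene shifts by the same constant, so the ranking is untouched, but any fixed cut-off such as B > 0 selects a different set of genes.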

>Since the adjusted P Values rank the genes in the same order as B... 
>why use B at all?

The B-statistic is kept in limma for two reasons. One is to keep the 
connection with the Lonnstedt and Speed (2002) paper. The other is 
because the B-statistics and the p-values theoretically can order the 
genes differently if there are lots of NA observations or zero 
weights in the data. In practice, the rankings tend to remain the 
same even in the presence of some NA observations.

>This is very true...
>So, in practical terms, it's probably best to stick to P values when 
>I need to make cut offs, and use B if I think a volcano plot 
>illustrates better the point I want to make about the DE genes in a 
>particular experiment, but without giving too much importance to the 
>actual value (or use the convest function to estimate the proportion 
>of DE genes, but bearing in mind that -as you point out above- the 
>true proportion is likely to be larger, just beyond our limits of detection)...
>
>does that sound about right?

Right, except for one point. The convest() function really will try 
to estimate the true proportion, including all the genes with 
microscopic fold changes which individually are beyond detection. So 
I would put it the other way around: the proportion of biologically 
interesting DE genes is likely to be smaller than the convest() estimate.

Best wishes
Gordon