[BioC] Limma B-statistics

Mon Oct 25 10:59:43 CEST 2004

I have had the same problem as described below, although I have not used the functions classifyTestsF() or decideTests().

Here are my commands. They are almost exactly as in the limma tutorial.

festuca.norm0 is an object of class marrayNorm.
The experiment is a dye-swap experiment with 2 arrays (one with each labelling). There are two different types of samples on the arrays, and the goal is to find the differentially expressed genes.
There are 5 replicated spots of each gene on each array (556 genes in total). Only spots with printed genes are included in the analysis.
gene is a logical vector indicating whether each spot is a gene.

f.cor<-duplicateCorrelation(maM(festuca.norm0)[gene,],design=c(1,-1),ndups=5)
fit <- lmFit(festuca.norm0[gene,],design=c(1,-1),ndups=5,correlation=f.cor$cor)
eb <- eBayes(fit)
toptable(number = 25,genelist = gnames,fit = fit, eb = eb, adjust = "fdr")
plot(fit$coef,eb$lods,xlab="Log2 Fold Change",ylab="Log Odds",pch=16,cex=0.2)

I'm a beginner with R, Bioconductor (and microarrays), so I hope any answers will give simple explanations/comments.

------------------------------------------------------------------------------
Ingunn Berget
Agricultural University of Norway
Department of Animal and Aquacultural Sciences

> Date: Fri, 22 Oct 2004 11:06:15 +0100
> From: Brian Lane <bsl8096 at liverpool.ac.uk>
> Subject: [BioC] Limma B-statistics
> To: bioconductor at stat.math.ethz.ch
> Message-ID: <E12C817EC5C08FD1B9BF7823 at 182105-93607r.liv.ac.uk>
> Content-Type: text/plain; charset=us-ascii; format=flowed
>
> Hi,
> I need some help with the interpretation of B statistics generated by
> eBayes in the limma package.
>
> I want to compare gene expression in three groups of Affy samples. The
> probe level data was generated from .cel files (ReadAffy()), an exprSet
> object was generated using mas5 (scaled to 100) and a linear model fitted
> to the data using a design based on the three groups (6, 5, and 5 samples
> in each group, respectively). I have then made 3 contrasts to cover all
> possible comparisons within the data set, and generated empirical Bayes
> statistics using eBayes. I've then used classifyTestsF to classify each
> gene according to the contrasts.
>
> The results of all this are 23 significantly differentially expressed
> genes. The moderated t-values for all these 23 genes have p<0.01. However,
> all the B-values are <0 (average -3!). In fact, a volcano plot of log-odds
> against fold-change for the three contrasts shows that all the B-values are
> negative.
>
> My understanding is that B<0 implies the gene is more likely to not be
> differentially expressed than to be differentially expressed. If this is
> the case, should I take the "significant genes" seriously? If not, is there
> any reason why the B-values should all be negative or does this simply
> reflect the fact that there is little evidence of differential expression
> in the data set as a whole?

Yes, negative B-values are supposed to indicate little evidence of differential expression.
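
For intuition, a B-statistic is a log-odds of differential expression, so it can be mapped back to a probability with the logistic function. The following is only a minimal base-R sketch, assuming natural-log odds; the vector B and the name prob_de are illustrative and not from the original analysis. Note also that the B-values depend on the prior proportion of differentially expressed genes assumed by eBayes(), so these probabilities should be read loosely.

```r
# B is the natural-log posterior odds that a gene is differentially expressed.
# Illustrative values only; -3 is the average B reported in the question.
B <- c(-3, 0, 3)

# plogis() is the logistic function exp(B) / (1 + exp(B)).
prob_de <- plogis(B)

# B = 0 is even odds; B = -3 is roughly a 5% chance of differential expression.
print(round(prob_de, 3))
```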

I think the problem is likely to be that you have used classifyTestsF() without any adjustment for
multiple testing.  Please note that classifyTestsF() does not adjust for multiple testing across
probe sets.  You are supposed to supply classifyTestsF() with a p-value cutoff of your own (lower
than 0.01!) that reflects the number of probe sets.  See the Ecoli case study in the User's
Guide for an example.
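
To illustrate why the cutoff has to reflect the number of probe sets, here is a small base-R sketch using simulated p-values for genes with no differential expression at all. It does not use limma; n_probes is a hypothetical array size, and Benjamini-Hochberg and Bonferroni are shown only as two common ways of accounting for the number of tests, not as the procedure used in the User's Guide.

```r
# Simulated p-values for genes with no true differential expression.
# n_probes is a hypothetical array size, not taken from the thread.
set.seed(1)
n_probes <- 10000
p_raw <- runif(n_probes)

# Unadjusted cutoff: about 1% of null genes pass by chance alone.
n_raw <- sum(p_raw < 0.01)

# Benjamini-Hochberg adjustment across all probe sets.
p_bh <- p.adjust(p_raw, method = "BH")
n_bh <- sum(p_bh < 0.01)

# A Bonferroni-style division gives a much stricter raw cutoff.
cutoff_bonf <- 0.01 / n_probes

cat("raw:", n_raw, " BH:", n_bh, " Bonferroni cutoff:", cutoff_bonf, "\n")
```

With an unadjusted cutoff of 0.01, on the order of a hundred of these null genes would be expected to pass by chance, which is why a handful of "significant" genes at that level is weak evidence on its own.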

I have found that this aspect of classifyTestsF() is often misunderstood, so I recommend
that you switch to decideTests() in the newer version of limma instead of classifyTestsF().

You might find the section "Statistics for Differential Expression" in the User's Guide helpful.

Gordon

> Regards,
> Brian Lane
> Dept of Haematology
> Liverpool University
