Hi all,

I have spent the past few days searching the newsgroup archives,  
reading vignettes, books, etc. in search of a method for ranking  
genes according to differential expression between a disease state, a  
normal state, and a control state, given Illumina BeadStudio summary  
data.

There is a thread (called "illumina --> limma?") which notes that  
Illumina's suggested method for background correction and  
normalization gives negative values, which are incompatible with the  
log2 transformation of expression data common before using lmFit.  An  
email from Wolfgang Huber in this thread suggests using vsn, but how  
can this work to background correct and normalize data in a way that  
is suitable for lmFit?  Is it as simple as running
vsn(exprs(SummaryData)) and then multiplying the resulting matrix by
log2(exp(1))?  (The SummaryData object is one that would be created from  
running the beadarray package function readBeadSummaryData(), and is  
essentially an ExpressionSet).
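
To make the question concrete, here is a minimal sketch of the arithmetic
I have in mind.  It is plain Python rather than R, uses only the standard
library, and does not call vsn itself.  The intensity values and the
helper names (glog_natural, to_log2_scale, rescaled_glog) are made up for
illustration.  I am assuming that the core of the vsn transformation is
an arsinh on the natural-log scale; as I understand it, the real package
also fits a per-array offset and scale before that step, which this
sketch ignores.

```python
import math

# Hypothetical background-corrected intensities. The negative value mimics
# what the thread reports for Illumina's background subtraction.
intensities = [-12.0, 0.0, 35.0, 480.0, 15000.0]


def glog_natural(x):
    """arsinh(x) = log(x + sqrt(x^2 + 1)); defined for every real x."""
    return math.asinh(x)


def to_log2_scale(natural_log_value):
    """Convert a natural-log-scale value to log2 units (multiply by log2(e))."""
    return natural_log_value * math.log2(math.e)


def rescaled_glog(x):
    """arsinh transform followed by the log2(e) rescaling asked about above."""
    return to_log2_scale(glog_natural(x))


for x in intensities:
    # A plain log2 is undefined for zero and negative intensities.
    plain = math.log2(x) if x > 0 else float("nan")
    print(x, plain, rescaled_glog(x))
```

For large intensities arsinh(x) is close to log(2x), so the rescaled
value sits about one unit above log2(x), while zero and negative
intensities still map to finite numbers.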

Mark Dunning recommends using non-normalized data
(http://article.gmane.org/gmane.science.biology.informatics.conductor/9721)  
for analysis on the log2 scale, which he seems to suggest is  
preferable to a linear scale, because of "a very obvious relationship  
between the mean and variance."  However, isn't normalization of the  
data essential to make accurate comparisons as required for  
differential expression analysis?  And is it a bad idea to analyze the  
data on a scale other than log2?

If anyone has done a differential expression analysis using R and  
Illumina data, could they please respond with their method and any  
comments on its pros and cons?

Many thanks,

Todd DeLuca
Biological Software Engineer

Center for Biomedical Informatics
Computational Biology Initiative
Harvard Medical School
10 Shattuck Street
Boston, MA 02115

