[BioC] Limma: Very high logFC values

Sat Sep 8 05:12:09 CEST 2007

Dear Ingrid,

You are getting yourself into knots here. It is 
easier than that. The following should work fine:

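   library(lumi)    # provides lumiR, lumiT, lumiN
   library(limma)   # provides lmFit
   x <- lumiR("yourIlluminaSummaryFile")
   y <- lumiT(x)                      # transform; values are now on a log-scale
   y <- lumiN(y, method="quantile")   # quantile normalization
   fit <- lmFit(y, design)            # design is your own design matrix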

Note that lumiT() is analogous to vsn but 
customised for Illumina data, and it returns 
expression values which are already on a log-scale.
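
To see why the scale matters for lmFit, here is a small base-R sketch. 
The intensities are made-up values at roughly the levels reported for 
the two probes in the original post, not the actual data, and 
group_diff() is a hypothetical helper, not a limma function. For a 
simple two-group comparison the estimated coefficient is the 
difference of the group means, so it is computed here directly.

```r
# Made-up intensities (illustration only), 3 samples per group
raw <- rbind(
  probeA = c(10100, 9900, 10000, 5050, 4950, 5000),
  probeB = c(112, 108, 110, 82, 78, 80)
)
group <- factor(rep(c("g1", "g2"), each = 3))

# Hypothetical helper: difference of group means for each probe,
# which is what a simple two-group fit estimates
group_diff <- function(m) {
  rowMeans(m[, group == "g1", drop = FALSE]) -
    rowMeans(m[, group == "g2", drop = FALSE])
}

diff_raw <- group_diff(raw)        # raw scale: difference of intensities
diff_log <- group_diff(log2(raw))  # log2 scale: log2 fold change

print(round(diff_raw, 1))
print(round(diff_log, 2))
```

With values like these, the raw-scale difference for the first probe 
is about 5000, whereas the log2-scale difference is about 1, i.e. a 
two-fold change.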

It is very easy to check for yourself whether the 
data is on the log-scale or not. Just type

   summary(exprs(y))

Do the values range from about 0 to 16 (as for log2 data) 
or from 0 to around 64000 (as for raw data)?
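
If you want to automate that check, a crude sketch in base R might 
look like the following. looks_log_scale() and its max_log threshold 
are hypothetical, not part of lumi or limma, and the test matrices are 
made-up numbers. In practice you would pass exprs(y) instead.

```r
# Hypothetical helper (not part of lumi or limma): crude check of the scale.
# Values on the log2-scale should stay well below max_log; raw
# intensities will usually exceed it by a wide margin.
looks_log_scale <- function(m, max_log = 20) {
  max(m, na.rm = TRUE) <= max_log
}

# Made-up matrices for illustration
raw_like <- matrix(c(80, 110, 5000, 10000), nrow = 2)
log_like <- log2(raw_like)

print(looks_log_scale(raw_like))
print(looks_log_scale(log_like))
```

This is only a heuristic. Looking at summary(exprs(y)) yourself 
remains the more reliable check.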

Let me gently point out that you never did tell 
us exactly what commands you used that led to 
the original problem you reported. So responders 
can only guess at what your real problem is. 
Since you say you have analysed Illumina data 
successfully in the past, I'd have to guess that 
the real problem in this case may arise from some 
aspect of your data or your analysis that you haven't yet told us about.

Best wishes
Gordon

>Date: Thu, 6 Sep 2007 13:18:04 +0200
>From: Ingrid H. G. Østensen <Ingrid.H.G.Ostensen at rr-research.no>
>Subject: Re: [BioC] Limma: Very high logFC values
>To: "Joern Toedling" <toedling at ebi.ac.uk>
>Cc: bioconductor at stat.math.ethz.ch
>
>Hi
>
>After reading all the responses to my e-mail I 
>have come up with the following idea:
>
>Read the data with lumiR
>Normalize it with lumiN using the quantile parameter.
>Get the expression values: norm_data <- exprs(data_norm)
>Use log2 on the expression data: data_n_exp <- log2(norm_data) or use vsn or ....
>Use data_n_exp for the statistical analysis (for limma)
>
>Just one thing: Should I log2 transform the data 
>before or after the normalization?
>
>Regards,
>Ingrid
>
>
>
>
>Hello,
>
>limma's lmFit expects log-transformed expression values as input, and
>the returned "log fold-change" is then about the difference between the
>mean expression levels in each group, which given your input data may
>very well be 5000 but this is not the proper log fold change at all. I
>don't know what kind of data preprocessing you performed but you may
>want to make sure that you feed log-transformed data into lmFit.
>
>Regards,
>Joern
>
>Ingrid H. G. Østensen wrote:
> > Hi
> >
> > I am using limma to analyze Illumina expression data (two groups), and
> > this time I got some really high logFC values for some genes and "low"
> > for others. Example:
> >
> > Probe        log2 Ratio (logFC)   Moderated t-statistic (t)   Raw p-value   Adjusted p-value   B
> > ILMN_27575   5443.972             27.305                      1.81E-06      0.009621899        -2.29002
> > ILMN_14823   42.5                 19.077                      1.00E-05      0.022251754        -2.32116
> >
> > The first gene has intensities in one group (3 samples) around 10 000
> > and in the other group (3 samples) around 5000, and the second gene has
> > intensities around 110 in the first group and around 80 in the second
> > group.
> >
> > I have never seen such high logFC values before. Are they realistic?
> > Do these values mean that there are big differences combined with high
> > intensities?
> >
> > Regards,
> > Ingrid