[BioC] limma
Gordon K Smyth
smyth at wehi.EDU.AU
Thu Apr 7 03:29:41 CEST 2011
Dear Seraya,
Wei Shi has put his finger on the issue in a recent reply to this thread.
Let me elaborate.
Firstly, the true fold change for a gene expressed in one condition but
not the other is Infinity. While true, this is a bit unhelpful, because an
Infinite fold change doesn't tell you whether the gene is highly expressed
or just barely expressed in the condition for which it is expressed.
limma gets around this problem in the following way (assuming that you're
using limma preprocessing as well as linear model fitting). limma offsets
all the expression values away from zero, so that all genes get a minimum
expression level.
If you use the limma neqc() function to normalize Illumina data, the
default offset is 16, translating to 4 on the log2 scale. This is why
your AveExpr values will never be less than 4. So, when a gene is absent
in one condition, and has average expression value x in the other, the
fold change is computed something like:
logFC = log2( (x+16) / (0+16) )
This means that limma never returns an infinite fold change. Note also
that the denominator is not noise, rather it is the offset plus a very
small amount of noise. This means that the estimated fold change is not
unstable or highly variable. It is quite stable, but biased. The act of
offsetting the expression values away from zero means that the fold
changes tend to be underestimated, although the bias is negligible for
highly expressed genes. Generally speaking, the gain in noise reduction
and statistical power that arises from using a small offset far outweighs
the disadvantage of biasing the fold changes. This has been
extensively discussed in the recent paper:
Shi, W, Oshlack, A, and Smyth, GK (2010). Optimizing the noise versus bias
trade-off for Illumina Whole Genome Expression BeadChips. Nucleic Acids
Research 38, e204.
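
To make the offset arithmetic above concrete, here is a minimal sketch in plain Python. It is not limma code: the function names are hypothetical, and it ignores the background correction and quantile normalization that neqc() also performs. It only assumes the offset of 16 mentioned above, and shows (a) that an on/off gene gets a finite log fold change that grows with its expression level, and (b) that the shrinkage caused by the offset is large for lowly expressed genes and small for highly expressed ones.

```python
import math

OFFSET = 16.0  # offset quoted in this post for neqc(); 16 is 4 on the log2 scale


def offset_log_fc(x_cond1, x_cond2=0.0, offset=OFFSET):
    """log2 fold change after adding the same offset to both conditions.

    Hypothetical helper for illustration only, not part of limma.
    """
    return math.log2((x_cond1 + offset) / (x_cond2 + offset))


def raw_log_fc(x_cond1, x_cond2):
    """log2 fold change with no offset (infinite if x_cond2 is zero)."""
    if x_cond2 == 0:
        return math.inf
    return math.log2(x_cond1 / x_cond2)


# (a) Genes absent in condition 2: finite values that increase with x,
# so the genes can still be ranked by fold change.
for x in (16.0, 48.0, 1008.0):
    print(f"x = {x:7.1f}  raw = {raw_log_fc(x, 0.0)}  offset = {offset_log_fc(x):.3f}")

# (b) A true two-fold change at low and at high expression: the offset
# shrinks the estimate a lot at low intensity and very little at high.
for low, high in ((16.0, 32.0), (1600.0, 3200.0)):
    print(f"{low:7.1f} -> {high:7.1f}  raw = {raw_log_fc(high, low):.3f}  "
          f"offset = {offset_log_fc(high, low):.3f}")
```

The intensities used here are made up for illustration; they are not taken from any dataset.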
By the way, you might like to try out the propexpr() function in limma
also, see:
Shi, W, de Graaf, C, Kinkel, S, Achtman, A, Baldwin, T, Schofield, L,
Scott, H, Hilton, D, Smyth, GK (2010). Estimating the proportion of
microarray probes expressed in an RNA sample. Nucleic Acids Research 38,
2168-2176.
You could say to the reviewer: "limma ensures that all probes are assigned
at least a minimum non-zero expression level on all arrays, in order to
minimize the variability of log-intensities for lowly expressed probes.
Probes that are expressed in one condition but not the other will be assigned
a large fold change for which the denominator is the minimum expression
level. This approach has the advantage that genes can be ranked by fold
change in a meaningful way, because genes with larger expression
changes will always be assigned a larger fold change."
Best wishes
Gordon
> Date: Tue, 05 Apr 2011 18:05:08 +0200
> From: "Seraya Maouche" <Seraya.Maouche at uk-sh.de>
> To: <Bioconductor at r-project.org>
> Subject: [BioC] Limma
>
> Dear Prof Gordon, dear Bioconductor members,
>
> I have performed gene expression analysis using Limma (Illumina human
> ref8) comparing two types of cells (referred below as cond1 and cond2).
> Based on detection call, I filtered out transcripts which are absent in
> both types of cells. Transcripts which were expressed only in one cell
> type were included in the analysis.
>
> I have received the comment below from a reviewer who seems not to agree
> with calculating fold changes for genes expressed only in one condition.
> Would it be possible to have your opinion about this?
>
> Thank you in advance for your time,
> S Maouche
>
> "There is a little conceptual difficulty related to the cond1/cond2
> comparisons for genes that are considered not detected. If a gene
> product is absent (0) in one cell then no fold change can be computed
> (table 2). I don't know how to circumvent this difficulty except by
> saying that the "noise" is considered to reflect low expression. The
> terms "not detected" and "not expressed" are often used interchangeably
> while this is not the same. Detection is based on the definition adopted
> and in many places of the manuscript it should be used in place of
> expression."