[BioC] Estimating fold change from limma 'log2FC' using lumi

Gordon K Smyth smyth at wehi.EDU.AU
Thu Feb 16 22:54:40 CET 2012


Dear Ross,

Short answer:

The vst transformation is asymptotically equivalent to the log2, and 
2^logFC is the correct transformation.
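
As a minimal sketch of that back-transformation in base R: the logFC and 
confidence limits below are copied from the first three rows of the vst 
table quoted further down, while the data frame name `tab` and the columns 
`FC`, `FC.lower` and `FC.upper` are illustrative names of my own, not limma 
output.

```r
# logFC and 95% confidence limits copied from the quoted vst table
tab <- data.frame(
  geneSymbol = c("EEF1A2", "SLN", "MYBPC1"),
  logFC      = c(6.737470, 7.534754, 7.524289),
  CI.025     = c(6.577077, 7.345115, 7.322845),
  CI.975     = c(6.897863, 7.724392, 7.725733)
)

# Back-transform from the log2 scale to a fold change.
# The confidence limits are transformed in the same way.
tab$FC       <- 2^tab$logFC
tab$FC.lower <- 2^tab$CI.025
tab$FC.upper <- 2^tab$CI.975

print(tab[, c("geneSymbol", "logFC", "FC", "FC.lower", "FC.upper")])
```

Because 2^x is monotone, the back-transformed limits still bracket the 
back-transformed estimate, although the interval is no longer symmetric 
around it.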

Longer answer:

See

   http://www.ncbi.nlm.nih.gov/pubmed/20929874

for why vst gives small fold changes, and what you should do about it.

BTW, there is nothing "linear" about raw scale intensities for 
fold-changes.  The logFC are far more linear in their statistical 
behaviour.
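
One way to illustrate this remark, as a small sketch with made-up 
intensities (the gene names and values are hypothetical): a two-fold 
increase and a two-fold decrease are symmetric on the log2 scale but not on 
the raw ratio scale.

```r
# Hypothetical raw-scale intensities for two genes (illustrative values only)
control <- c(geneA = 100, geneB = 200)
treated <- c(geneA = 200, geneB = 100)

ratio <- treated / control   # raw-scale fold changes: 2 and 0.5
logFC <- log2(ratio)         # log2 fold changes: +1 and -1

# Averaging ratios suggests a net increase; averaging logFC does not.
mean(ratio)
mean(logFC)
```

On the ratio scale the two changes average to more than 1, whereas the 
logFC values cancel, which is the behaviour one wants from an additive 
linear model.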

Best wishes
Gordon

> Date: Thu, 16 Feb 2012 14:58:02 +1100
> From: Ross <ross.lazarus at gmail.com>
> To: bioconductor at r-project.org
> Subject: [BioC] Estimating fold change from limma 'log2FC' using lumi
> 	vst instead of log2 transformation?
>
> Dear Bioconductors,
>
> I have a question related to estimates of microarray expression fold change
> after using vst in lumi (as the authors recommend) rather than a simple
> log2 transformation. I apologise if this is a foolish question or if my
> google-fu was too weak to find the answer when I searched.
>
> I am working with Illumina Sentrix expression data, using lumi's vst and
> quantile normalization, then limma to model differential expression.
>
> One of my collaborators has asked for the fold changes reported by limma
> (as log2FC) on a linear scale so he can check the top ones with qPCR.
>
> Initially, I was going to make the obvious suggestion that he take 2^log2FC to
> transform what lumi reported as log2FC back to a linear scale - but this is
> probably not the right transformation - because vst is not log2!
>
> To confirm this, I reran the analysis replacing the lumi vst option with a
> log2 transformation to see what differences there were.
>
> As can be seen from the small sample below, there were variations in gene
> ranking, differences in p values (vst gave smaller p values) and the values
> reported as 'log2FC' were vastly different.
>
> I understand that limma estimates "log2FC" as the difference between
> transformed intensity values - but how can they be interpreted if the data
> were transformed with vst rather than log2? I imagine that the vst results
> are 'better' because it's a better transformation than log2 but I am not
> sure how to explain this to my colleague - advice appreciated.
>
>
> Using vst (apologies for the formatting):
>
>        ID                  geneSymbol  logFC     CI.025    CI.975    AveExpr    t          P.Value       adj.P.Val     B
> 5447   BJnZRnS5W.xQwHiYVQ  EEF1A2      6.737470  6.577077  6.897863  10.133500  137.41304  2.694315e-70  7.260639e-66  147.7207
> 23109  rQlXje3RRE1fQVEa3k  SLN         7.534754  7.345115  7.724392  10.666689  129.97466  5.361216e-69  7.223702e-65  145.1452
> 18091  leKDHioqXpx7d3kd54  MYBPC1      7.524289  7.322845  7.725733  10.609381  122.18770  1.481136e-67  1.330455e-63  142.2374
> 18134  K_W3elXdKe58XuEknc  MYL3        7.076920  6.862857  7.290984  10.309408  108.14785  1.038158e-64  6.994069e-61  136.3602
> 18123  llE343nSV6K5SQJVOc  MYH7        7.296190  7.070833  7.521547  10.494978  105.91110  3.185741e-64  1.716987e-60  135.3381
>
>
>
> Using log2:
>
>        ID                  geneSymbol  logFC     CI.025    CI.975    AveExpr    t          P.Value       adj.P.Val     B
> 5447   BJnZRnS5W.xQwHiYVQ  EEF1A2      8.833916  8.552823  9.115010  9.164795   102.80600  8.505465e-67  2.292053e-62  139.7942
> 18134  K_W3elXdKe58XuEknc  MYL3        9.145482  8.812582  9.478383  9.337462   89.86878   1.832162e-63  2.468655e-59  133.0097
> 18179  9X8dQtVHUZNfXRQdIk  MYOZ1       8.509667  8.194560  8.824774  9.452997   88.34276   4.864153e-63  4.369306e-59  132.1303
> 23659  6Ij1VCKLt6Cje5ehLo  SRL         6.949980  6.685096  7.214864  8.398941   85.83111   2.517144e-62  1.695800e-58  130.6422
> 24961  3IlVNOVFFR3pHp9Nfw  TPM2        8.845876  8.502794  9.188958  8.828952   84.34499   6.808768e-62  3.669654e-58  129.7369
>
>
>> sessionInfo()
> R version 2.14.1 (2011-12-22)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C
> [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8
> [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8
> [7] LC_PAPER=C                 LC_NAME=C
> [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] lumi_2.6.0      nleqslv_1.9.2   methylumi_2.0.4 Biobase_2.14.0
> [5] limma_3.10.2
>
> loaded via a namespace (and not attached):
> [1] affy_1.32.1           affyio_1.22.0         annotate_1.32.1
> [4] AnnotationDbi_1.16.11 BiocInstaller_1.2.1   DBI_0.2-5
> [7] grid_2.14.1           hdrcde_2.15           IRanges_1.12.5
> [10] KernSmooth_2.23-7     lattice_0.20-0        MASS_7.3-16
> [13] Matrix_1.0-3          mgcv_1.7-13           nlme_3.1-103
> [16] preprocessCore_1.16.0 RSQLite_0.11.1        xtable_1.6-0
> [19] zlibbioc_1.0.0
>
