[BioC] Calculating the average differential expression

James W. MacDonald jmacdon at med.umich.edu
Mon Feb 6 15:18:22 CET 2006


Hi Mick,

michael watson (IAH-C) wrote:
> Hi
> 
> This question applies to most of the microarray packages...
> 
> When calculating the average differential expression for replicate
> spots, do you take an average of the log2(ratio)'s, or the log2 of the
> average ratio?  (they come up with different numbers)
> 
> Which one and why?

The mean of the log2(ratios). The main reason for doing so is to make 
the data look more like they come from a Normal distribution. On the 
natural scale the down-regulated genes will have a range of (0, 1] and 
the up-regulated genes a range of [1, inf), so the distribution of 
these data will have a strong right skew. If you take logs, the ranges 
become (-inf, 0] and [0, inf) for down- and up-regulated genes, 
respectively, so changes of equal magnitude in either direction are 
symmetric about zero.
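
To make the difference between the two averages concrete, here is a 
minimal Python sketch. The two replicate ratios are made-up values (one 
two-fold down, one two-fold up), and the function names are only for 
illustration; they do not come from any microarray package.

```python
import math

def mean_of_log2(ratios):
    """Average the log2 ratios (the approach recommended above)."""
    return sum(math.log2(r) for r in ratios) / len(ratios)

def log2_of_mean(ratios):
    """Average the raw ratios first, then take log2."""
    return math.log2(sum(ratios) / len(ratios))

# Hypothetical replicate spots: one two-fold down, one two-fold up.
ratios = [0.5, 2.0]

print("mean of log2(ratio):", mean_of_log2(ratios))
print("log2 of mean ratio: ", log2_of_mean(ratios))
```

With these two values the log2 ratios are -1 and +1, so their mean is 
zero (no net change), whereas the mean of the raw ratios is 1.25, which 
gives a positive log2 value. The raw-scale average is pulled toward the 
up-regulated replicate because of the skew described above. Note that 
the mean of the log2 ratios is the same as the log2 of the geometric 
mean of the ratios.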

Many of the analyses that we perform with these data assume that each 
datum is a Normal variate (for instance, the t-test), so you have to 
take logs first to make the data look more Normal.

Another reason is to minimize/eliminate any dependence of the variance 
on the expression level. In other words, we would like the variability 
of the data to be relatively consistent regardless of the spot 
intensity. Taking logs tends to help in this respect as well.
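
As a rough illustration of the variance point, the sketch below assumes 
a purely multiplicative error model, which is a simplification; real 
arrays also have additive noise at low intensities, which is why logs 
help without necessarily fixing everything. The error factors and the 
two intensity levels are hypothetical numbers chosen for the example.

```python
import math
import statistics

# Hypothetical multiplicative errors applied to every spot (assumption).
factors = [0.8, 0.9, 1.0, 1.1, 1.25]

def spread(level):
    """Return (SD on the raw scale, SD on the log2 scale) at one level."""
    raw = [level * f for f in factors]
    logged = [math.log2(x) for x in raw]
    return statistics.stdev(raw), statistics.stdev(logged)

low_raw, low_log = spread(100.0)      # a dim spot
high_raw, high_log = spread(10000.0)  # a bright spot

print("raw-scale SD:  low =", low_raw, " high =", high_raw)
print("log2-scale SD: low =", low_log, " high =", high_log)
```

Under this assumption the raw-scale standard deviation grows in 
proportion to the intensity (100 times larger for the bright spot), 
while on the log2 scale the two standard deviations are the same up to 
rounding error.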

HTH,

Jim


> 
> Thanks
> Mick
> 


-- 
James W. MacDonald
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623


