[R] "CV" for log normal data

peter dalgaard pdalgd at gmail.com
Wed Feb 22 00:43:21 CET 2012


On Feb 21, 2012, at 22:44 , array chip wrote:

> Hi, I have a microarray dataset from Agilent chips. The data are log ratios between test samples and a universal reference RNA. Because the mean of a log ratio is very close to 0, the coefficient of variation (CV) doesn't really apply to this kind of data. What measure of dispersion would people use so that I can compare across genes on the chip to find stably expressed genes? Is there something similar to CV that would be easily interpreted?

What's wrong with the SD of log(X)? That's pretty much equivalent to the CV, at least for CVs less than 50%:

> x <- rlnorm(1000,5,.5)
> sd(x)/mean(x)
[1] 0.5252718
> sd(log(x))
[1] 0.5037995
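
For a log-normal variable the relationship can be written down exactly, which may help explain why the two numbers agree so closely. If log(X) has standard deviation sigma, then CV(X) = sqrt(exp(sigma^2) - 1), which is close to sigma when sigma is small. A minimal sketch (the helper name lnorm_cv is made up for illustration and is not part of base R):

```r
# Exact CV of a log-normal variable whose log has standard deviation sigma.
# lnorm_cv is a hypothetical helper name, not a base R function.
lnorm_cv <- function(sigma) sqrt(exp(sigma^2) - 1)

sigma <- c(0.1, 0.25, 0.5, 1)

# Compare the SD on the log scale (sigma) with the exact CV on the original scale
data.frame(sigma = sigma,
           cv    = lnorm_cv(sigma),
           ratio = lnorm_cv(sigma) / sigma)
```

With sigma = 0.5 this gives a CV of about 0.53, in line with the simulated value; by sigma = 1 the CV exceeds sigma by roughly 30%, so the approximation is only useful for moderate variation.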

Looking for a relative measure of precision _after_ taking logs strikes me as very odd. If you scale your original observations by a constant factor, the log of that factor is simply _added_ to the log-transformed data, without affecting their variation at all.
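
The scaling argument is easy to check numerically. The sketch below uses simulated data, an arbitrary seed and an arbitrary factor of 10 (all assumptions for illustration): multiplying x by a constant shifts every log value by log(constant), so the SD of the logs is unchanged, just as the CV on the original scale is.

```r
set.seed(1)                   # arbitrary seed, for reproducibility only
x <- rlnorm(1000, 5, 0.5)     # simulated log-normal data
k <- 10                       # arbitrary scale factor

# Logs shift by log(k) ...
shift <- mean(log(k * x)) - mean(log(x))
all.equal(shift, log(k))

# ... but their spread is unchanged
all.equal(sd(log(k * x)), sd(log(x)))

# The CV on the original scale is likewise unaffected by scaling
all.equal(sd(k * x) / mean(k * x), sd(x) / mean(x))
```

Each all.equal() call should return TRUE, which is why sd(log(x)) already behaves as a relative measure and needs no further division by a mean.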


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


