[BioC] variance and coefficient of variation with edgeR

Miguel Gallach miguel.gallach at univie.ac.at
Tue Mar 27 18:12:01 CEST 2012


It seems I could not paste the plot... I hope you can see it now.

Sorry,
Miguel

=========

On Tue, Mar 27, 2012 at 4:22 PM, Miguel Gallach <miguel.gallach at univie.ac.at> wrote:

> Dear list,
>
> I am analyzing RNA-Seq data with edgeR for a typical two-factor design:
>
> $samples
>                  group lib.size norm.factors
> R4.Hot     HotAdaptedHot 17409289    0.9881635
> R5.Hot     HotAdaptedHot 17642552    1.0818144
> R9.Hot    ColdAdaptedHot 20010974    0.8621807
> R10.Hot   ColdAdaptedHot 14064143    0.8932791
> R4.Cold   HotAdaptedCold 11968317    1.0061084
> R5.Cold   HotAdaptedCold 11072832    1.0523857
> R9.Cold  ColdAdaptedCold 22386103    1.0520949
> R10.Cold ColdAdaptedCold 17408532    1.0903311
>
>
> I found something quite interesting: non-native populations have a
> systematically higher coefficient of variation than native populations.
> That is: CV(R4.Hot-R5.Hot) < CV(R9.Hot-R10.Hot) and
> CV(R4.Cold-R5.Cold) > CV(R9.Cold-R10.Cold).
>
> Here you have the variables and calculations:
>
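> # C.V. taken as the square root of the dispersion
> C.V.R4.R5HC = sqrt(data$R4.R5.HC.disp)
> C.V.R9.R10HC = sqrt(data$R9.R10.HC.disp)
>
> # Variance: V = mu * (1 + dispersion * mu), with Conc.* used as mu
> var_R4.R5_HC = Conc.R4.R5.HC * (1 + R4.R5.HC.disp * Conc.R4.R5.HC)
> var_R9.R10_HC = Conc.R9.R10.HC * (1 + R9.R10.HC.disp * Conc.R9.R10.HC)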
>
>
> The attached plot compares the variances (V = mu * (1 + dispersion * mu),
> according to
> http://seqanswers.com/forums/showthread.php?t=5591&highlight=edgeR+variance)
> and the C.V. (C.V. = sqrt(dispersion)) between biological groups at Hot
> temperature (i.e., comparing R4.Hot-R5.Hot vs. R9.Hot-R10.Hot).
>
> According to the left plot, the variance is equal between the two groups
> for most genes, so the assumption of equal variances holds. Hence we can
> perform the DE test. Am I right?
>
> However, what I cannot understand is that sqrt(R9.R10.HC.disp) >
> sqrt(R4.R5.HC.disp), i.e., the coefficient of variation of gene
> expression is systematically higher for all genes in R9.R10 than in
> R4.R5. For this to be true, since the variances are equal and
> C.V. = sqrt(var)/mean, the mean of R9.R10 (i.e., Conc.R9.R10.HC) should
> be lower than that of R4.R5, which is obviously false. The reciprocal
> analysis for these samples at cold temperatures produces the equivalent,
> but inverted, result.
>
> What am I missing? How can this happen?
>
>
> Any help would be appreciated.
>
> Many thanks,
> Miguel Gallach
>
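For reference, the two formulas quoted above (V = mu * (1 + dispersion * mu) and C.V. = sqrt(dispersion)) can be put side by side in a small R sketch. It is only an illustration under stated assumptions: the function names nb_var and nb_cv, and the values of mu, disp and p, are hypothetical and not taken from the data in this thread. It shows that the C.V. implied by the variance formula is sqrt(1/mu + dispersion), which approaches sqrt(dispersion) only for large means, and that if mu were on a proportion scale (an assumption; the post does not say what scale Conc.* is on) the variance formula would return values very close to mu whatever the dispersion.

```r
# Negative binomial variance as quoted in the post:
# V = mu * (1 + dispersion * mu)
nb_var <- function(mu, dispersion) mu * (1 + dispersion * mu)

# C.V. implied by that variance: sqrt(V) / mu = sqrt(1 / mu + dispersion)
nb_cv <- function(mu, dispersion) sqrt(nb_var(mu, dispersion)) / mu

# Hypothetical values, not taken from the data in the post
mu   <- c(10, 100, 1000, 1e5)        # expected counts
disp <- c(low = 0.05, high = 0.20)   # two dispersion levels

cv_low  <- nb_cv(mu, disp[["low"]])
cv_high <- nb_cv(mu, disp[["high"]])

# Identity check: sqrt(V) / mu equals sqrt(1 / mu + dispersion)
stopifnot(isTRUE(all.equal(cv_low, sqrt(1 / mu + disp[["low"]]))))

# As mu grows, the implied C.V. approaches sqrt(dispersion)
print(rbind(mu = mu, cv_low = cv_low, cv_high = cv_high))
print(sqrt(disp))

# Assumed scale: if mu is a small proportion rather than a count,
# dispersion * mu is negligible and V stays close to mu for both dispersions
p <- 1e-5
v_prop <- c(low = nb_var(p, disp[["low"]]), high = nb_var(p, disp[["high"]]))
print(v_prop / p)
```

Under these assumptions, two groups can show nearly identical values of V while their sqrt(dispersion) values differ, so the two panels of the plot need not contradict each other.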


-- 
Miguel Gallach
Center for Integrative Bioinformatics Vienna (CIBIV)
Max F. Perutz Laboratories (MFPL)
Telf: +43 1 4277 24029

Postal Address:
Ebene 1
Campus Vienna Biocenter 5
CIBIV, MFPL
1030 Vienna
Austria

e-mail:
miguel.gallach at univie.ac.at
migaca2001 at gmail.com

