[BioC] question edgeR multifactorial design
Gordon K Smyth
smyth at wehi.EDU.AU
Wed Jan 15 00:41:59 CET 2014
Dear David,
> Date: Mon, 13 Jan 2014 13:25:38 +0100
> From: David Rengel <david.rengel at toulouse.inra.fr>
> To: bioconductor at r-project.org
> Subject: [BioC] question edgeR multifactorial design
>
> Hi,
>
> I am carrying out a multifactorial analysis on edgeR according to the
> experimental design shown at the end of this message (which I called
> design.all).
>
> My reference genotype being "Col", I created the following model matrix
> and its subsequent glmFit model, in order to see for expression
> modulation according to genotype, time, as well as their interaction:
>
> matrix.all<- model.matrix(~gtype*time, data=design.all)
> colnames(matrix.all)
> #  [1] "(Intercept)"       "gtypeOE1"           "gtypeR24"        "gtypeX710"
> #  [5] "timeT1.5"          "timeT3"             "timeT6"          "gtypeOE1:timeT1.5"
> #  [9] "gtypeR24:timeT1.5" "gtypeX710:timeT1.5" "gtypeOE1:timeT3" "gtypeR24:timeT3"
> # [13] "gtypeX710:timeT3"  "gtypeOE1:timeT6"    "gtypeR24:timeT6"  "gtypeX710:timeT6"
> fit.all <- glmFit (dge.all, matrix.all)
>
> Following the 3.3.4 section of the user guide, I carried on as follows:
>
> lrt.all.gtype <- glmLRT(fit.all, coef=2:4)
> lrt.all.time<- glmLRT(fit.all, coef=5:7)
> lrt.all.inter <- glmLRT(fit.all, coef=8:16)
>
> Which leaves me with two questions that remain unclear to me:
> 1-Section 3.3.4. of the user guide states that coefficients 3 and 4 of
> the depicted example correspond to "the effects of the placebo at 1 hour
> and 2 hours". Why just placebo? Does it not include the "Drug"?
Because that is how factorial models are defined in R by default. The
main effect for a factor is always relative to the treatment condition
with all factors at their base levels. For the example in the User's
Guide, the base levels are "Placebo" and "0h". So the Time main effect
changes the Time while keeping Drug at the Placebo level. And the Drug
effect changes the Drug while keeping Time at the 0h level.
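A minimal base R sketch of this point, applied to the genotype and time factors from the question. It assumes a simplified layout with one row per genotype/time combination (the object names `X`, `diff.col` and `diff.oe1` are illustrative only). Subtracting two rows of the design matrix shows which coefficients separate two conditions.

```r
# Simplified, hypothetical layout: one row per genotype/time combination.
design.all <- expand.grid(
  gtype = factor(c("Col", "OE1", "R24", "X710")),
  time  = factor(c("T0", "T1.5", "T3", "T6"))
)
rownames(design.all) <- paste(design.all$gtype, design.all$time, sep = ".")

# Factorial design matrix, same formula as in the question.
X <- model.matrix(~ gtype * time, data = design.all)

# Coefficients that distinguish T3 from T0 within the base genotype Col:
# only the time main effect is involved.
diff.col <- X["Col.T3", ] - X["Col.T0", ]
names(diff.col[diff.col != 0])

# The same comparison within OE1 also needs the interaction term.
diff.oe1 <- X["OE1.T3", ] - X["OE1.T0", ]
names(diff.oe1[diff.oe1 != 0])
```

In this sketch the first comparison involves only `timeT3`, while the second involves `timeT3` plus `gtypeOE1:timeT3`, so the time main effects describe the time course in Col only.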
> In my case, do coefficients 5 to 7 hence refer to "Col" at different
> times? Do they not include the other genotypes at the corresponding
> times?
>
> 2-Concerning the interaction, why does neither "Col" nor "T0"
> appear among the coefficients?
Because they are the base levels, and all coefficients are relative to
them.
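As a small illustration in base R (the variable names `gtype` and `gtype2` are assumptions made for this sketch), the base level is simply the first level of the factor. It never gets a column of its own, and relevel() changes which level plays that role.

```r
# By default the levels are sorted, so "Col" comes first and is the base level.
gtype <- factor(c("Col", "OE1", "R24", "X710"))
levels(gtype)
colnames(model.matrix(~ gtype))   # no "gtypeCol" column

# Choosing another reference level changes which level is absent.
gtype2 <- relevel(gtype, ref = "OE1")
colnames(model.matrix(~ gtype2))  # no "gtype2OE1" column
```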
> Should not they be part of the comparison?
Well, it depends on what question you want to answer.
> Do I not need them to follow the difference between genotypes in time?
> Section 3.3.4 states that "(coef=5:6)...detects genes that respond
> differently to the drug, relative to the placebo, at either of the times
> (i.e. 1h and 2h)".
I am a bit unclear as to what questions you are trying to answer about
your data. I don't want to give a long tutorial on factorial models,
because model.matrix() is a part of the standard installation of R rather
than part of edgeR and because I don't recommend factorial models.
If you are not familiar with factorial models in R, then it is better not to
use them. (Actually I don't recommend factorial models for genomic
experiments, even if you are an expert on them.)
You have a 4x4 factorial model with 3 replicates per combination. I
recommend that you set up a single factor representing all 16 possible
conditions, then use contrasts to make the comparisons you want to make.
This approach is outlined in Section 3.3.1 of the User's Guide. This
gives identical results to the factorial model, but allows you to make the
comparisons you want to make in a more explicit way, so that you actually
know what you are testing.
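The single-factor approach could be sketched as follows. This is an illustration only, not code from the User's Guide: it assumes a balanced 4x4 layout with 3 replicates, and `group`, `matrix.group`, `make.contrast` and the `con.*` objects are hypothetical names. The edgeR calls are left as comments because they need the `dge.all` object from the question.

```r
# Hypothetical balanced layout: 4 genotypes x 4 times x 3 replicates.
design.all <- expand.grid(
  rep   = 1:3,
  time  = c("T0", "T1.5", "T3", "T6"),
  gtype = c("Col", "OE1", "R24", "X710"),
  stringsAsFactors = TRUE
)

# One factor with 16 levels, one per genotype/time combination.
group <- factor(paste(design.all$gtype, design.all$time, sep = "."))

# No intercept: each coefficient is the fitted level of one condition.
matrix.group <- model.matrix(~ 0 + group)
colnames(matrix.group) <- levels(group)

# Helper (hypothetical): +1 for 'plus' conditions, -1 for 'minus' conditions.
make.contrast <- function(plus, minus, design) {
  con <- setNames(numeric(ncol(design)), colnames(design))
  con[plus]  <- 1
  con[minus] <- -1
  con
}

# T3 versus T0 within OE1.
con.OE1.T3vsT0 <- make.contrast("OE1.T3", "OE1.T0", matrix.group)

# Is the T3 versus T0 change different in OE1 than in Col?
# (OE1.T3 - OE1.T0) - (Col.T3 - Col.T0)
con.inter <- make.contrast(c("OE1.T3", "Col.T0"),
                           c("OE1.T0", "Col.T3"), matrix.group)

# Both parametrisations span the same column space, so the fits agree.
matrix.all <- model.matrix(~ gtype * time, data = design.all)
qr(cbind(matrix.all, matrix.group))$rank

# With edgeR loaded and dge.all as in the question (not run here):
# fit.group <- glmFit(dge.all, matrix.group)
# lrt.inter <- glmLRT(fit.group, contrast = con.inter)
# topTags(lrt.inter)
```

Each contrast names the conditions being compared, so the question a test answers can be read directly from it. limma's makeContrasts() can build equivalent contrasts from expressions such as `OE1.T3 - OE1.T0`.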
Best wishes
Gordon
> Thank you for any help you could provide on this. Thank you for your work.
>
> Kind regards,
> David Rengel
>
>
> gtype time
> Col.T0.R2 Col T0
> Col.T0.R3 Col T0
> Col.T1.5.R1 Col T1.5
> Col.T1.5.R2 Col T1.5
> Col.T1.5.R3 Col T1.5
> Col.T3.R1 Col T3
> Col.T3.R2 Col T3
> Col.T3.R3 Col T3
> Col.T6.R1 Col T6
> Col.T6.R2 Col T6
> Col.T6.R3 Col T6
> OE1.T0.R2 OE1 T0
> OE1.T0.R3 OE1 T0
> OE1.T1.5.R1 OE1 T1.5
> OE1.T1.5.R2 OE1 T1.5
> OE1.T1.5.R3 OE1 T1.5
> OE1.T3.R1 OE1 T3
> OE1.T3.R2 OE1 T3
> OE1.T3.R3 OE1 T3
> OE1.T6.R1 OE1 T6
> OE1.T6.R2 OE1 T6
> OE1.T6.R3 OE1 T6
> R24.T0.R2 R24 T0
> R24.T0.R3 R24 T0
> R24.T1.5.R1 R24 T1.5
> R24.T1.5.R2 R24 T1.5
> R24.T1.5.R3 R24 T1.5
> R24.T3.R1 R24 T3
> R24.T3.R2 R24 T3
> R24.T3.R3 R24 T3
> R24.T6.R1 R24 T6
> R24.T6.R2 R24 T6
> R24.T6.R3 R24 T6
> 710.T0.R2 X710 T0
> 710.T0.R3 X710 T0
> 710.T1.5.R1 X710 T1.5
> 710.T1.5.R2 X710 T1.5
> 710.T1.5.R3 X710 T1.5
> 710.T3.R1 X710 T3
> 710.T3.R2 X710 T3
> 710.T3.R3 X710 T3
> 710.T6.R1 X710 T6
> 710.T6.R2 X710 T6
> 710.T6.R3 X710 T6
>
> --
> David Rengel
> Laboratoire des Interactions Plantes Micro-organismes (LIPM)
> INRA/CNRS
> 24 Chemin de Borde Rouge
> Auzeville
> CS 52627
> 31326 Castanet Tolosan Cedex
>
> E mail: david.rengel at toulouse.inra.fr
> Tel 33 (0)5 61 28 55 91
> Fax 33 (0)5 61 28 50 61
>
> http://www.toulouse.inra.fr/lipm