[BioC] question edgeR multifactorial design

Gordon K Smyth smyth at wehi.EDU.AU
Wed Jan 15 00:41:59 CET 2014

Dear David,

> Date: Mon, 13 Jan 2014 13:25:38 +0100
> From: David Rengel <david.rengel at toulouse.inra.fr>
> To: bioconductor at r-project.org
> Subject: [BioC] question edgeR multifactorial design
> Hi,
> I am carrying out a multifactorial analysis on edgeR according to the
> experimental design shown at the end of this message (which I called
> design.all).
> My reference genotype being "Col", I created the following model matrix
> and its subsequent glmFit model, in order to look for expression
> modulation according to genotype, time, as well as their interaction:
> matrix.all<- model.matrix(~gtype*time, data=design.all)
> colnames(matrix.all)
> #  [1] "(Intercept)" "gtypeOE1" "gtypeR24" "gtypeX710"
> #  [5] "timeT1.5" "timeT3" "timeT6" "gtypeOE1:timeT1.5"
> #  [9] "gtypeR24:timeT1.5" "gtypeX710:timeT1.5" "gtypeOE1:timeT3" "gtypeR24:timeT3"
> # [13] "gtypeX710:timeT3" "gtypeOE1:timeT6" "gtypeR24:timeT6" "gtypeX710:timeT6"
> fit.all <- glmFit (dge.all, matrix.all)
> Following the 3.3.4 section of the user guide, I carried on as follows:
> lrt.all.gtype <- glmLRT(fit.all, coef=2:4)
> lrt.all.time<- glmLRT(fit.all, coef=5:7)
> lrt.all.inter <- glmLRT(fit.all, coef=8:16)
> Which leaves me with two questions that remain unclear to me:
> 1-Section 3.3.4 of the user guide states that coefficients 3 and 4 of
> the depicted example correspond to "the effects of the placebo at 1 hour
> and 2 hours". Why just placebo? Does it not include the "Drug"?

Because that is how factorial models are defined in R by default.  The 
main effect for a factor is always relative to the treatment condition 
with all factors at their base levels.  For the example in the User's 
Guide, the base levels are "Placebo" and "0h".  So the Time main effect 
changes the Time while keeping Drug at the Placebo level.  And the Drug 
effect changes the Drug while keeping Time at the 0h level.
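
To make the base-level convention concrete, here is a minimal sketch in base R. The factor names `Treat` and `Time` and the cell means in `mu` are made up for illustration; only the Placebo/Drug and 0h/1h/2h levels follow the User's Guide example. With one row per treatment combination, the coefficients can be read off exactly:

```r
# Hypothetical 2 x 3 layout, one row per treatment combination.
cells <- expand.grid(Time = c("0h", "1h", "2h"), Treat = c("Placebo", "Drug"))
# Set the level order explicitly so that "Placebo" and "0h" are the base levels.
cells$Treat <- factor(cells$Treat, levels = c("Placebo", "Drug"))
cells$Time  <- factor(cells$Time,  levels = c("0h", "1h", "2h"))

# Made-up cell means on the log scale, in the row order of `cells`.
mu <- c(Placebo.0h = 1, Placebo.1h = 2, Placebo.2h = 4,
        Drug.0h = 1.5, Drug.1h = 5, Drug.2h = 9)

X <- model.matrix(~ Treat * Time, data = cells)
# X is square and full rank here, so the coefficients solve X %*% beta = mu.
beta <- setNames(as.vector(solve(X, mu)), colnames(X))
print(round(beta, 3))

# Time main effect: the change from 0h to 1h within Placebo only,
# i.e. mu["Placebo.1h"] - mu["Placebo.0h"].
time1h.placebo <- beta[["Time1h"]]

# The same change within Drug needs the interaction term as well,
# i.e. mu["Drug.1h"] - mu["Drug.0h"].
time1h.drug <- beta[["Time1h"]] + beta[["TreatDrug:Time1h"]]
```

So the main-effect coefficients describe the base-level group only, and the interaction coefficients describe how the other group departs from it.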

> In my case, do coefficients 5 to 7 hence refer to "Col" at different
> times? Do they not include the other genotypes at the corresponding
> times?
> 2-Concerning the interaction, why does neither "Col" nor "T0"
> appear among the coefficients?

Because they are the base levels, and all coefficients are relative to
the base levels.
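
As a small illustration in base R, assuming the factor levels shown in the design table at the end of the message: the base level of a factor is simply its first level (alphabetical by default), and no column of the model matrix is named after it. The choice of "OE1" as an alternative reference below is arbitrary and only there to show the effect of `relevel()`.

```r
# Factors as in the design table; levels default to alphabetical order.
cells <- expand.grid(gtype = factor(c("Col", "OE1", "R24", "X710")),
                     time  = factor(c("T0", "T1.5", "T3", "T6")))

base.gtype <- levels(cells$gtype)[1]   # base level of gtype
base.time  <- levels(cells$time)[1]    # base level of time

# No column is created for a base level.
cn <- colnames(model.matrix(~ gtype * time, data = cells))
mentions.base <- any(grepl("Col|T0", cn))

# relevel() picks a different base level; the coefficients are then
# relative to OE1 at T0 and a "gtypeCol" column appears instead.
cells2 <- cells
cells2$gtype <- relevel(cells2$gtype, ref = "OE1")
cn2 <- colnames(model.matrix(~ gtype * time, data = cells2))
print(cn2)
```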

> Should not they be part of the comparison?

Well, it depends on what question you want to answer.

> Do I not need them to follow the difference between genotypes in time?
> Section 3.3.4 states that "(coef=5:6)... detects genes that respond
> differently to the drug, relative to the placebo, at either of the times"
> (i.e. 1h and 2h).

I am a bit unclear as to what questions you are trying to answer about 
your data.  I don't want to give a long tutorial on factorial models, 
because model.matrix() is a part of the standard installation of R rather 
than part of edgeR and because I don't recommend factorial models.

If you are not familiar with factorial models in R, then it is better not to 
use them.  (Actually I don't recommend factorial models for genomic 
experiments, even if you are an expert on them.)

You have a 4x4 factorial model with 3 replicates per combination.  I 
recommend that you set up a single factor representing all 16 possible 
conditions, then use contrasts to make the comparisons you want to make. 
This approach is outlined in Section 3.3.1 of the User's Guide.  This 
gives identical results to the factorial model, but allows you to make the 
comparisons you want to make in a more explicit way, so that you actually 
know what you are testing.
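
A rough sketch of the single-factor setup follows, in base R so that it runs without count data. It assumes three replicates in every cell and the genotype and time labels from the design table; `make_contrast` is a hypothetical helper written for this example, not an edgeR function (limma's `makeContrasts()` can build the same vectors from expressions). The commented lines indicate where `glmFit()` and `glmLRT()` with its `contrast` argument would be called on the real `dge.all` object.

```r
# Sample layout: 4 genotypes x 4 times, 3 replicates each (assumed).
design.all <- expand.grid(rep   = 1:3,
                          time  = c("T0", "T1.5", "T3", "T6"),
                          gtype = c("Col", "OE1", "R24", "X710"),
                          stringsAsFactors = TRUE)

# One factor with 16 levels, one per genotype-by-time combination.
group <- factor(paste(design.all$gtype, design.all$time, sep = "."))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Hypothetical helper: +1 for the groups in `plus`, -1 for those in `minus`.
make_contrast <- function(plus, minus, lev = levels(group)) {
  v <- setNames(numeric(length(lev)), lev)
  v[plus]  <- v[plus] + 1
  v[minus] <- v[minus] - 1
  v
}

# Time response of OE1 at 1.5h, relative to its own T0.
con.OE1.time <- make_contrast("OE1.T1.5", "OE1.T0")

# Does OE1 respond differently from Col at 1.5h?
# (OE1.T1.5 - OE1.T0) - (Col.T1.5 - Col.T0)
con.inter <- make_contrast(c("OE1.T1.5", "Col.T0"), c("OE1.T0", "Col.T1.5"))

# With the real data (not run here):
# fit <- glmFit(dge.all, design)
# lrt <- glmLRT(fit, contrast = con.inter)

# Both designs span the same space, so the fitted models are the same:
# stacking them side by side does not raise the rank above 16.
X.fact <- model.matrix(~ gtype * time, data = design.all)
rank.both <- qr(cbind(design, X.fact))$rank
```

Each contrast names the groups being compared, so the question being tested can be read directly from the contrast vector.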

Best wishes

> I thank you for any help you could provide me on this. Thank you for your work.
> Kind regards,
> David Rengel
>            gtype time
> Col.T0.R2    Col   T0
> Col.T0.R3    Col   T0
> Col.T1.5.R1  Col T1.5
> Col.T1.5.R2  Col T1.5
> Col.T1.5.R3  Col T1.5
> Col.T3.R1    Col   T3
> Col.T3.R2    Col   T3
> Col.T3.R3    Col   T3
> Col.T6.R1    Col   T6
> Col.T6.R2    Col   T6
> Col.T6.R3    Col   T6
> OE1.T0.R2    OE1   T0
> OE1.T0.R3    OE1   T0
> OE1.T1.5.R1  OE1 T1.5
> OE1.T1.5.R2  OE1 T1.5
> OE1.T1.5.R3  OE1 T1.5
> OE1.T3.R1    OE1   T3
> OE1.T3.R2    OE1   T3
> OE1.T3.R3    OE1   T3
> OE1.T6.R1    OE1   T6
> OE1.T6.R2    OE1   T6
> OE1.T6.R3    OE1   T6
> R24.T0.R2    R24   T0
> R24.T0.R3    R24   T0
> R24.T1.5.R1  R24 T1.5
> R24.T1.5.R2  R24 T1.5
> R24.T1.5.R3  R24 T1.5
> R24.T3.R1    R24   T3
> R24.T3.R2    R24   T3
> R24.T3.R3    R24   T3
> R24.T6.R1    R24   T6
> R24.T6.R2    R24   T6
> R24.T6.R3    R24   T6
> 710.T0.R2   X710   T0
> 710.T0.R3   X710   T0
> 710.T1.5.R1 X710 T1.5
> 710.T1.5.R2 X710 T1.5
> 710.T1.5.R3 X710 T1.5
> 710.T3.R1   X710   T3
> 710.T3.R2   X710   T3
> 710.T3.R3   X710   T3
> 710.T6.R1   X710   T6
> 710.T6.R2   X710   T6
> 710.T6.R3   X710   T6
> -- 
> David Rengel
> Laboratoire des Interactions Plantes Micro-organismes (LIPM)
> 24 Chemin de Borde Rouge
> Auzeville
> CS 52627
> 31326 Castanet Tolosan Cedex
> E mail: david.rengel at toulouse.inra.fr
> Tel 33 (0)5 61 28 55 91
> Fax 33 (0)5 61 28 50 61
> http://www.toulouse.inra.fr/lipm
