[BioC] edgeR: GLM for multi-factor and mulit-level designs
Dorota Herman
dorota.herman at psb.vib-ugent.be
Tue Nov 27 16:18:20 CET 2012
Hello everyone,
I have been playing with the GLM approach for RNA-seq data in DESeq and
edgeR but I am fairly new in DE analyses. I am interested in pairwise
comparisons in multi-factor multi-level designs. My question concerns my
understanding of an application of the glmLRT function
#My code is
>countsTable <- read.delim(file)
>header <-
c('A_1','A_2','A_3','B_1','B_2','B_3','C_1','C_2','C_3','D_1','D_2','D_3')
>names(countsTable) <- header
>conds <- factor(c('A','A','A’,'B','B','B','C','C','C','D','D','D'))
>Ex<-factor(c('exper1', 'exper2', 'exper3', 'exper2', 'exper3', 'exper4',
'exper1', 'exper3', 'exper4', 'exper2', 'exper3', 'exper4'))
>group <- conds
>dge <- DGEList(counts=countsTable,group=group)
>dge <- calcNormFactors(dge, method='TMM')
>design <- model.matrix(~Ex+conds)
>rownames(design)<-colnames(dge)
>dge <- estimateGLMCommonDisp(dge,design)
>dge <- estimateGLMTrendedDisp(dge, design)
>dge <- estimateGLMTagwiseDisp(dge, design)
>fit <- glmFit(dge, design)
#my design looks like:
> design
(Intercept) Eexper2 Eexper3 Eexper4 condsB condsC condsD
A_1 1 0 0 0 0 0 0
A_2 1 1 0 0 0 0 0
A_3 1 0 1 0 0 0 0
B_1 1 1 0 0 1 0 0
B_2 1 0 1 0 1 0 0
B_3 1 0 0 1 1 0 0
C_1 1 0 0 0 0 1 0
C_2 1 0 1 0 0 1 0
C_3 1 0 0 1 0 1 0
D_1 1 1 0 0 0 0 1
D_2 1 0 1 0 0 0 1
D_3 1 0 0 1 0 0 1
attr(,"assign")
[1] 0 1 1 1 2 2 2
attr(,"contrasts")
attr(,"contrasts")$E
[1] "contr.treatment"
attr(,"contrasts")$conds
[1] "contr.treatment"
I understand that R function rewrote the model matrix because of the
identifiability problem for parameter estimations. However it causes my
confusion in further usage of that design for the pairwise comparisons.
In a case when I want to obtain differentially expressed genes between A
and B, I understand I should use the function:
>lrt <- glmLRT(fit,coef="condsB")
Is it correct?
In a case when I want to obtain differentially expressed genes between C
and D (*without taking into account A*), are these calling functions
correct?
>C_D<-makeContrasts(condsC-condsD,levels=design)
>lrt <- glmLRT(fit,contrast=C_D)
Does it mean that glmLRT function takes into account first conds (A)
when we use ‘coef’ parameter and discard it when we use ‘contrast’
parameter? Or it means that the second analysis, between C and D takes
into account differential expression with A too?
I hope my explanation of the question is not too confusing.
Best wishes
Dorota
--
==================================================================
Dorota Herman, PhD
VIB Department of Plant Systems Biology, Ghent University
Technologiepark 927
9052 Gent, Belgium
Tel: +32 (0)9 3313692
Email:dorota.herman at psb.vib-ugent.be
Web: http://www.psb.ugent.be
More information about the Bioconductor
mailing list