[BioC] edgeR - one way ANOVA, coef parameter interpretation in glmLRT()
Mark Robinson
mark.robinson at imls.uzh.ch
Mon Jun 10 20:46:35 CEST 2013
Hey Michal,
I've added some comments in line below …
On 10.06.2013, at 15:34, Michal Okoniewski <michal.okoniewski at fgcz.ethz.ch> wrote:
> Hey Mark,
>
> A question on one-way ANOVA design with edgeR:
>
> I do something like that for 3 groups, so similar case like in the vignette (glm functionality):
>
>
> d <- calcNormFactors(d)
>
> d <- estimateCommonDisp(d, verbose=TRUE)
>
> d <- estimateTagwiseDisp(d)
>
>
> TS <- factor(cc)
>
> design<- model.matrix(~0+TS)
>
> colnames(design) <- levels(TS)
You seem to have classic edgeR (from 2.7, [1]) but probably want GLM edgeR (from 2.8, [1]), no? See below.
> fit <- glmFit(d, design)
>
> fit2 <- glmLRT(fit)
>
> # or fit2 <- glmLRT(fit, coef=1) ???
>
>
> outTab <- topTags(fit, n=30666)
>
> An the question is:
> __ How to get the p-values of the F-test for one-way anova? In a basic comparison of 3 groups… __
>
> In the vignette you have written:
>
> "The fi t has three parameters. The fi rst is the baseline level of group 1. The second and third
> are the 2 vs 1 and 3 vs 1 di ferences." edgeR vignette (p17, date March31st 2013)
>
> This sounds to me a bit like hard-coded contrasts… in addition, default value of coef is ncol(glmfit$design), so 3 in this case. Shall I use coef=1?
> I am a bit confused…
>
> Thanks!
You need to be careful of what you use for the design matrix, or more precisely, how you pair your design matrix with your contrast matrix (or what coefficients of the design matrix you specify). Let me try and spell it out.
In the vignette (2.8.3, [1]), the design matrix is defined as:
group <- factor(c(1,1,2,2,3,3))
design <- model.matrix(~group)
> design
(Intercept) group2 group3
1 1 0 0
2 1 0 0
3 1 1 0
4 1 1 0
5 1 0 1
6 1 0 1
Note that your design matrix, as defined above, is *different*, like:
group <- factor(c(1,1,2,2,3,3))
design <- model.matrix(~0+group)
> design
group1 group2 group3
1 1 0 0
2 1 0 0
3 0 1 0
4 0 1 0
5 0 0 1
6 0 0 1
So, what to do next (in terms of glmLRT) depends on how you've made the design matrix.
For a 1-way ANOVA style comparison with 3 levels of the factor, I find the following the simplest:
d <- DGEList(...)
d <- calcNormFactors(d)
-----
design <- model.matrix(~group)
d <- estimateGLMCommonDisp(d, design)
d <- estimateGLMTrendedDisp(d, design)
d <- estimateGLMTagwiseDisp(d, design)
f <- glmFit(d, design)
lrt <- glmLRT(f, coef=2:3)
topTags(lrt)
-----
but note that this is equivalent to:
-----
design <- model.matrix(~0+group)
d <- estimateGLMCommonDisp(d, design)
d <- estimateGLMTrendedDisp(d, design)
d <- estimateGLMTagwiseDisp(d, design)
f <- glmFit(d, design)
lrt <- glmLRT(f, contrast=matrix(c(-1,1,0,0,-1,1),ncol=2))
topTags(lrt)
-----
So, really, it's a matter of pairing your design matrix with the correct 'coef' or 'contrast' argument of glmLRT().
Hope that helps,
Mark
[1] http://www.bioconductor.org/packages/release/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf
> I send a copy to BioC list, as I could not find the answer there.
> If I ask stupid question – please give me a simple solution too :)
>
> Cheers,
> Michal
>
>
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] edgeR_3.2.3 limma_3.16.5
>
> loaded via a namespace (and not attached):
> [1] tools_3.0.1
>
>
>
> --
> Dr. Michal Okoniewski
>
> Functional Genomics Center Zurich (mostly Wed-Fri)
> Winterthurerstrasse 190, 8057 Zurich
> Room Y32 H66, Phone: +41 44 635 39 24
>
> Department of Neuroimmunology and MS Research,
> UniSpital Zurich, (mostly Mon-Thu)
> Raemistrasse 100, 8091 Zurich
> Room: OPS D37, Phone: +41 44 255 1756
More information about the Bioconductor
mailing list