[BioC] edgeR: GLM & residuals and model fitting & hypothesis testing

Tue Feb 14 17:44:31 CET 2012

Dear all,

1) GLM & residuals:

I have a question concerning the use of GLMs in edgeR and the analysis
of the residuals after model fitting.

I have followed all the steps up to and including model fitting, e.g.:
glmfit.D <- glmFit(D, design, dispersion = D$tagwise.dispersion)

The object returned by the fit contains the following components:
> names(glmfit.D)
 [1] "coefficients"  "fitted.values" "fail"          "not.converged"
 [5] "deviance"      "df.residual"   "abundance"     "design"
 [9] "offset"        "dispersion"    "method"        "counts"
[13] "samples"

What would be the best way to obtain the residuals for the "genewise" GLMs?
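
To make the question concrete, here is a minimal sketch of one possibility: Pearson residuals computed by hand from the "counts", "fitted.values" and "dispersion" components listed above, assuming the negative binomial variance mu + dispersion * mu^2. The toy matrices and their values are made up for illustration; with a real fit one would presumably substitute glmfit.D$counts, glmfit.D$fitted.values and glmfit.D$dispersion. This is not necessarily the approach the package authors would recommend.

```r
# Toy stand-ins for the components of a glmFit() result (values are made up).
counts <- matrix(c(10, 12, 30, 28,
                    5,  7,  6,  4),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("gene1", "gene2"), paste0("s", 1:4)))
fitted.values <- matrix(c(11, 11, 29, 29,
                           6,  6,  5,  5),
                        nrow = 2, byrow = TRUE,
                        dimnames = dimnames(counts))
dispersion <- c(0.1, 0.2)  # assumed: one tagwise dispersion per gene (row)

# Negative binomial variance: mu + dispersion * mu^2.
# A vector with one entry per row recycles down the columns,
# so every gene (row) is paired with its own dispersion.
nb.var <- fitted.values + dispersion * fitted.values^2

# Pearson residuals: (observed - fitted) / sqrt(variance)
pearson.res <- (counts - fitted.values) / sqrt(nb.var)
pearson.res
```

Deviance residuals would be an alternative; the sketch uses Pearson residuals only because they need nothing beyond the components already present in the fit.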

2) model fitting & hypothesis testing:

I have a fully crossed design with two factors, each with two levels:
individual <- as.factor(c("indA","indA","indB","indB"))
treatment <- as.factor(c("treat1","treat2","treat1","treat2"))

In general, I am interested in three different aspects:
a) effect of individual
b) effect of treatment
c) interaction between individual and treatment

What would be the best way to test for those effects? Should I test
each of the three aspects with its own model, i.e.:
a) design <- model.matrix(~individual)
b) design <- model.matrix(~treatment)
c) design <- model.matrix(~individual*treatment)

Or would it make more sense to fit the additive model
design <- model.matrix(~individual+treatment)
and then test for
a) lrt.cd_ind <- glmLRT(D, glmfit.D, coef=2)
b) lrt.cd_treat <- glmLRT(D, glmfit.D, coef=3)
This way, the effects of both factors would be accounted for in the same model.
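
As a quick check of which coefficient index belongs to which factor, the design matrices can be inspected with base R alone. The sketch below uses only the two factors defined above; the names design.add and design.int are introduced here for illustration.

```r
individual <- factor(c("indA", "indA", "indB", "indB"))
treatment  <- factor(c("treat1", "treat2", "treat1", "treat2"))

# Additive model: intercept plus one column per factor
design.add <- model.matrix(~ individual + treatment)
colnames(design.add)
# "(Intercept)" "individualindB" "treatmenttreat2"
# so coef=2 refers to individual and coef=3 to treatment

# Interaction model: adds a fourth column for the interaction term
design.int <- model.matrix(~ individual * treatment)
colnames(design.int)

# Residual degrees of freedom = number of samples - number of coefficients
nrow(design.add) - ncol(design.add)  # 1
nrow(design.int) - ncol(design.int)  # 0
```

With only four samples, the interaction model has as many coefficients as samples, so no residual degrees of freedom remain for it.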

Thanks a lot!
Susanne