[BioC] EdgeR multi-factor testing question
Yanzhu [guest]
guest at bioconductor.org
Wed Jan 8 15:36:16 CET 2014
Dear Gordon,
I have one more question about the estimation of dispersion.
When the three-way interaction term is not significant, I will fit model 2, without the three-way interaction, to test the two-way interaction terms. When none of the interaction terms are significant, I will fit the additive model (model 3) to test the main effects. Could I use the same dispersion for all the models, i.e., model 1 (including everything), model 2 (without the three-way interaction term) and model 3 (additive model)? Could this dispersion be estimated under the design of model 1?
Thank you!
Yanzhu
---------------------------------------------------------
Dear Yanzhu,
Your analysis is fine from a code point of view. From a statistical point
of view, however, your analysis is too simple because it neglects the
principle of marginality:
http://en.wikipedia.org/wiki/Principle_of_marginality
For the model you have fitted, it makes sense to test for the three-way
interaction as you do. However it does not make statistical sense to test
for the main effects or two-way interactions until you have established that
the three-way interaction is non-significant.
For count data, the tests for the lower-level interactions need to be
computed by successively removing each level of interactions from the
model. See for example:
https://stat.ethz.ch/pipermail/bioconductor/2013-December/056584.html
This is the same as what the anova() function does in R for unbalanced
linear factorial models.
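As a rough sketch of what "successively removing each level of interactions"
could look like in code, assuming a hypothetical 2 x 2 x 2 layout with factors
A, B and C (these names, the simulated counts and the choice to estimate
dispersions once under the full design are illustrative assumptions, not
something this thread prescribes):

```r
# Hypothetical 2 x 2 x 2 factorial layout with 2 replicates per cell.
targets <- expand.grid(A = c("a1", "a2"), B = c("b1", "b2"),
                       C = c("c1", "c2"), rep = 1:2)

# Three nested designs: full model, no three-way term, additive model.
design1 <- model.matrix(~ A * B * C, data = targets)
design2 <- model.matrix(~ (A + B + C)^2, data = targets)
design3 <- model.matrix(~ A + B + C, data = targets)

# Coefficients tested at each step.
threeway <- grep(":.*:", colnames(design1))  # A:B:C column in model 1
twoway   <- grep(":", colnames(design2))     # A:B, A:C, B:C columns in model 2

# The edgeR part only runs if the package is installed.
if (requireNamespace("edgeR", quietly = TRUE)) {
  library(edgeR)
  set.seed(1)
  # Simulated counts, purely so that the sketch runs end to end.
  counts <- matrix(rnbinom(200 * nrow(targets), mu = 50, size = 10),
                   nrow = 200)
  y <- DGEList(counts = counts)
  y <- calcNormFactors(y)
  # Assumption: dispersions estimated under the full design and reused below.
  y <- estimateDisp(y, design1)

  lrt3 <- glmLRT(glmFit(y, design1), coef = threeway)  # three-way interaction
  lrt2 <- glmLRT(glmFit(y, design2), coef = twoway)    # all two-way terms
  lrtA <- glmLRT(glmFit(y, design3), coef = 2)         # main effect of A
}
```

Each test is computed from the model in which the tested terms are the
highest-order terms present, which is the point of the marginality principle.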
Furthermore, testing the two-way interactions is only sensible for genes
with non-significant 3-way interactions. Similarly, testing the main effect
is only sensible for genes with non-significant 2-way and 3-way
interactions. Otherwise these tests have no useful scientific meaning.
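The gene-wise restriction described above can be sketched in base R. The
p-values, the 0.05 cutoff and the use of Benjamini-Hochberg adjustment within
each remaining gene set are all assumptions made for illustration; in a real
analysis the p-values would come from the corresponding likelihood ratio tests.

```r
# Hypothetical per-gene p-values from the three levels of testing.
p3 <- c(0.001, 0.40, 0.70, 0.90)   # three-way interaction (model 1)
p2 <- c(0.50, 0.002, 0.60, 0.80)   # two-way interactions (model 2)
p1 <- c(0.01, 0.01, 0.003, 0.50)   # main effect (model 3)
alpha <- 0.05                      # assumed significance cutoff

# Step 1: genes with a significant three-way interaction.
sig3 <- p.adjust(p3, method = "BH") < alpha

# Step 2: two-way tests are only interpreted where step 1 was non-significant.
sig2 <- rep(FALSE, length(p2))
keep2 <- !sig3
sig2[keep2] <- p.adjust(p2[keep2], method = "BH") < alpha

# Step 3: main effects are only interpreted where no interaction was found.
sig1 <- rep(FALSE, length(p1))
keep1 <- !sig3 & !sig2
sig1[keep1] <- p.adjust(p1[keep1], method = "BH") < alpha

# Highest-order significant term per gene.
call <- ifelse(sig3, "three-way",
        ifelse(sig2, "two-way",
        ifelse(sig1, "main", "none")))
```

Genes 1 and 2 have small main-effect p-values in this toy input, but they are
not reported as main effects because a higher-order term is already
significant for them.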
This is a basic drawback of the factorial anova approach. You might
consider the alternative approach described in Section 3.3.1 of the edgeR
User's Guide.
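One common alternative to factorial formulas in edgeR is to treat each
treatment combination as its own group and test specific contrasts between
groups. Assuming that is the approach meant here (the section numbering may
differ between versions of the guide), a minimal sketch with hypothetical
factors A, B and C and simulated counts might look like this:

```r
# Hypothetical 2 x 2 x 2 layout with 2 replicates per combination.
targets <- expand.grid(A = c("a1", "a2"), B = c("b1", "b2"),
                       C = c("c1", "c2"), rep = 1:2)

# One group per treatment combination, e.g. "a1.b1.c1".
group <- factor(paste(targets$A, targets$B, targets$C, sep = "."))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Example contrast: effect of A within the b1.c1 combination.
con <- setNames(numeric(ncol(design)), colnames(design))
con["a2.b1.c1"] <- 1
con["a1.b1.c1"] <- -1

# The edgeR part only runs if the package is installed.
if (requireNamespace("edgeR", quietly = TRUE)) {
  library(edgeR)
  set.seed(1)
  # Simulated counts, purely so that the sketch runs end to end.
  counts <- matrix(rnbinom(200 * nrow(targets), mu = 50, size = 10),
                   nrow = 200)
  y <- DGEList(counts = counts, group = group)
  y <- calcNormFactors(y)
  y <- estimateDisp(y, design)
  fit <- glmFit(y, design)
  lrt <- glmLRT(fit, contrast = con)  # tests a2.b1.c1 versus a1.b1.c1
}
```

With this parametrization every comparison is an explicit contrast between
two or more groups, so each test has a direct interpretation without relying
on the hierarchy of interaction terms.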
Best wishes
Gordon
-- output of sessionInfo():
> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] edgeR_3.2.4 limma_3.16.8
loaded via a namespace (and not attached):
[1] tools_3.0.1
--
Sent via the guest posting facility at bioconductor.org.