[BioC] EdgeR multi-factor testing question
Gordon K Smyth
smyth at wehi.EDU.AU
Thu Jan 9 23:34:51 CET 2014
Dear Yanzhu,
Yes, that's how I would do it. Keep the same dispersions for all fits.
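
A minimal sketch of that workflow, in case it helps. It assumes a
hypothetical 2x2x2 design with factors A, B and C and simulated counts;
the object names (targets, design1, fit1 and so on) are illustrative and
are not taken from your code. It also assumes a current edgeR release
that provides estimateDisp(); older versions may need the
estimateGLMCommonDisp(), estimateGLMTrendedDisp() and
estimateGLMTagwiseDisp() sequence instead.

```r
library(edgeR)

set.seed(1)

# Hypothetical 2x2x2 layout with 2 replicates per cell (16 samples).
targets <- expand.grid(A = c("a1", "a2"), B = c("b1", "b2"),
                       C = c("c1", "c2"), rep = 1:2)

# Simulated counts with no true effects, for illustration only.
counts <- matrix(rnbinom(500 * nrow(targets), mu = 50, size = 10),
                 nrow = 500)
rownames(counts) <- paste0("gene", 1:500)

y <- DGEList(counts = counts)
y <- calcNormFactors(y)

# Model 1: everything; model 2: no three-way term; model 3: additive.
design1 <- model.matrix(~ A * B * C, data = targets)
design2 <- model.matrix(~ (A + B + C)^2, data = targets)
design3 <- model.matrix(~ A + B + C, data = targets)

# Estimate the dispersions once, under the design of model 1.
y <- estimateDisp(y, design1)

# Every fit takes its dispersions from y, so all three share them.
fit1 <- glmFit(y, design1)
fit2 <- glmFit(y, design2)
fit3 <- glmFit(y, design3)

# Three-way interaction: the last coefficient of model 1.
lrt3way <- glmLRT(fit1, coef = ncol(design1))

# Two-way interactions: the coefficients in model 2 beyond model 3.
lrt2way <- glmLRT(fit2, coef = (ncol(design3) + 1):ncol(design2))

# Main effect of A in the additive model (coefficient 2).
lrtA <- glmLRT(fit3, coef = 2)

topTags(lrt3way)
```

Only design1 is passed to the dispersion estimation step; the reduced
designs are used for fitting and testing only.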
Best wishes
Gordon
> Date: Wed, 8 Jan 2014 06:36:16 -0800 (PST)
> From: "Yanzhu [guest]" <guest at bioconductor.org>
> To: bioconductor at r-project.org, mlinyzh at gmail.com
> Subject: [BioC] EdgeR multi-factor testing question
>
> Dear Gordon,
>
> I have one more question about the estimation of dispersion.
>
> When the three-way interaction term is non-significant, I will fit
> model 2, without the three-way interaction, to test the two-way
> interaction terms. When all interaction terms are non-significant, I fit
> the additive model (model 3) to test the main effects. Could I use the
> same dispersion for all the models, i.e., model 1 (including
> everything), model 2 (without the three-way interaction term) and model 3
> (the additive model)? Could this dispersion be estimated under the design
> of model 1?
>
> Thank you!
> Yanzhu
>
> ---------------------------------------------------------
>
> Dear Yanzhu,
>
> Your analysis is fine from a code point of view. From a statistical point
> of view however your analysis is too simple because you are neglecting the
> principle of marginality:
>
> http://en.wikipedia.org/wiki/Principle_of_marginality
>
> For the model you have fitted, it makes sense to test for the three-way
> interaction as you do. However it does not make statistical sense to test
> for the main effects or two-way interactions until you have established that
> the three-way interaction is non-significant.
>
> For count data, the tests for the lower-level interactions need to be
> computed by successively removing each level of interactions from the
> model. See for example:
>
> https://stat.ethz.ch/pipermail/bioconductor/2013-December/056584.html
>
> This is the same as what the anova() function does in R for unbalanced
> linear factorial models.
>
> Furthermore, testing the two-way interactions is only sensible for genes
> with non-significant 3-way interactions. Similarly, testing the main effect
> is only sensible for genes with non-significant 2-way and 3-way
> interactions. Otherwise these tests have no useful scientific meaning.
>
> This is a basic drawback of the factorial anova approach. You might
> consider the alternative approach described in Section 3.3.1 of the edgeR
> User's Guide.
>
> Best wishes
> Gordon
>
>
>
>
> -- output of sessionInfo():
>
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] edgeR_3.2.4 limma_3.16.8
>
> loaded via a namespace (and not attached):
> [1] tools_3.0.1
>