[BioC] edgeR and DESeq2: model design and estimation of dispersion
Gordon K Smyth
smyth at wehi.EDU.AU
Mon Jun 16 05:30:59 CEST 2014
Dear Iddo,
No, it is not valid to use a different design matrix for the dispersion
estimation.
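To make this concrete, here is a minimal base-R sketch of the full design described in the original post. The factor names, the level labels, and the relabelling of individuals as 1..100 within each group are assumptions made for illustration, not details taken from the actual data. The point is that a single design matrix is built once, and that same object is what would be supplied to both the dispersion estimation and the GLM fit.

```r
# Sketch only: sample layout assumed from the original post
# (100 controls + 100 cases, each sampled before and after intervention).
n_per_group <- 100
samples <- expand.grid(
  intervention = factor(c("before", "after"), levels = c("before", "after")),
  individual   = factor(seq_len(n_per_group)),  # relabelled 1..100 within each group
  group        = factor(c("control", "case"), levels = c("control", "case"))
)

# Full design from the post: individual is nested within group.
design <- model.matrix(~ group * intervention + individual:group, data = samples)

dim(design)              # 400 rows (samples) by 202 columns (coefficients)
design_rank <- qr(design)$rank
design_rank == ncol(design)  # TRUE when the design is of full column rank
```

The 202 columns (4 for the group and intervention terms, plus 198 for the nested individual effects) are what make the fit demanding; dropping the individual terms for the dispersion step alone would mean estimating dispersions under a different model from the one being fitted.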
edgeR will handle your model with 400 samples, but it will admittedly be
slow. If this is too slow, then switch to voom() in the limma package,
which will be very fast, or to glmQLFTest() in the edgeR package, which
will still be relatively slow but faster than the glm routines in edgeR
(or DESeq2).
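The two alternatives can be sketched roughly as follows. This is an illustration on simulated toy data, not the poster's actual analysis: the data sizes, factor names and the choice of the group-by-intervention interaction as the coefficient of interest are all assumptions. The calls follow the interface of current edgeR and limma releases, where the quasi-likelihood test is run on a fit produced by glmQLFit().

```r
library(edgeR)   # also attaches limma

set.seed(1)
# Toy data standing in for the real study: 500 genes, 10 individuals per
# group, each sampled before and after intervention (sizes are assumptions).
n_per_group <- 10
samples <- expand.grid(
  intervention = factor(c("before", "after"), levels = c("before", "after")),
  individual   = factor(seq_len(n_per_group)),
  group        = factor(c("control", "case"), levels = c("control", "case"))
)
counts <- matrix(rnbinom(500 * nrow(samples), mu = 50, size = 5),
                 nrow = 500,
                 dimnames = list(paste0("gene", 1:500), NULL))

# One design matrix, used everywhere below.
design <- model.matrix(~ group * intervention + individual:group, data = samples)

y <- DGEList(counts = counts)
y <- calcNormFactors(y)

# Option 1: voom + limma (weighted linear models on log-CPM values).
v      <- voom(y, design)
fit_lm <- eBayes(lmFit(v, design))
top_lm <- topTable(fit_lm, coef = "groupcase:interventionafter", number = 10)

# Option 2: edgeR quasi-likelihood pipeline; the same design is used for
# dispersion estimation and for the fit.
y      <- estimateDisp(y, design)
fit_ql <- glmQLFit(y, design)
res_ql <- glmQLFTest(fit_ql, coef = "groupcase:interventionafter")
top_ql <- topTags(res_ql, n = 10)
```

In both options the individual blocking terms stay in the design throughout, so neither one relies on a simplified model at any stage.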
Best wishes
Gordon
> From: Iddo Ben-dov <iddobe at ekmd.huji.ac.il>
> Subject: edgeR and DESeq2: model design and estimation of dispersion
> Date: June 12, 2014 at 4:51:51 PM GMT+3
> To: bioconductor at r-project.org
>
> hi,
>
> in both edgeR and DESeq2, estimation of dispersion precedes negative
> binomial GLM fitting.
>
> my question is, can I use a design formula when estimating dispersion
> which is different from the formula used for GLM fitting? specifically,
> I would like to use a simplified design when estimating dispersion and a
> full design for GLM fitting.
>
> my motivation for doing so is that with the full design estimation of
> dispersion is too demanding for my computer and time.
>
> my dataset includes 400 mRNAseq profiles (~22,000 genes). there are 100
> controls and 100 cases, and each was sampled twice - before and after
> intervention.
>
> thus, the full design is:
> ~ group*intervention + individual:group (blocking factor)
>
> as I mentioned, estimation of dispersion with the above design is not
> practical, and I thus would like to simplify to: ~ group*intervention
>
> and introduce the 'individual' blocking factor only for NB GLM fitting.
>
> is this statistically valid?
>
> appreciate any help,
> iddo