[BioC] edgeR: new defaults of estimateTagwiseDisp and exactTest
Gordon K Smyth
smyth at wehi.EDU.AU
Fri Mar 8 08:06:19 CET 2013
Dear Zhuzhu,
The edgeR User's guide gives case studies of how we intend edgeR to be
used. You will see that we expect users to generally use the default
parameters. Although there are a great many parameters that can be varied
in principle in the edgeR functions, we do not expect them to be changed in
most analyses.
One exception was the prior.n argument to estimateTagwiseDisp. It was
never our intention that a fixed value for prior.n would be used for
datasets of different sizes, so previous versions of the User's Guide used
to explain how to set this parameter. Several versions ago, we eliminated
the prior.n argument and replaced it with prior.df so that the default
behavior was what we intended.
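
To see why a fixed prior.n does not suit datasets of different sizes, here is a small standalone sketch in Python. It is not edgeR code. It assumes that the prior degrees of freedom equal prior.n times the residual degrees of freedom (number of libraries minus number of groups); that relationship is an assumption made here for illustration, not something stated in this message. The function names and the example values are hypothetical.

```python
def residual_df(n_libraries, n_groups):
    """Residual degrees of freedom for a simple multi-group design."""
    df = n_libraries - n_groups
    if df <= 0:
        raise ValueError("need more libraries than groups")
    return df


def prior_df_from_prior_n(prior_n, n_libraries, n_groups):
    """Prior df implied by a given prior.n (assumed relationship)."""
    return prior_n * residual_df(n_libraries, n_groups)


def prior_n_from_prior_df(prior_df, n_libraries, n_groups):
    """prior.n implied by a given prior df (inverse of the above)."""
    return prior_df / residual_df(n_libraries, n_groups)


# Hypothetical two-group experiments with 2, 3 and 6 replicates per group.
for n_per_group in (2, 3, 6):
    n_libraries = 2 * n_per_group
    implied = prior_df_from_prior_n(10, n_libraries, 2)
    print(f"{n_libraries} libraries: prior.n = 10 implies prior df = {implied}")
```

Under this assumption, the same prior.n implies a much heavier prior as the number of libraries grows, whereas a fixed prior.df keeps the weight of the prior the same across experiment sizes.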
Over time we have reduced the default value for prior.df somewhat.
Larger values of prior.df give more priority to genes with large fold
changes between groups. Smaller values of prior.df give more priority to
genes with small dispersion values, i.e., to genes that are consistent
between replicates.
You ask about exactTest(). To learn more about the different rejection
regions, you should start by reading the help page for exactTest(), but I
doubt that this is causing you a problem. The theory behind the original
exact negative binomial test of Robinson and Smyth (2008) presupposed that
the conditional distribution of the counts given the genewise total was
unimodal (like a binomial distribution). This is true for typical
dispersion values, but is not true for very large dispersion values.
Hence the original exactTest can give totally inappropriate results when
the dispersion is very, very large. The new rejection region is similar
to the original when the dispersion is smallish, but gives sensible
results in any situation. Hence it should be used. The old rejection
region is preserved as an option only for backward compatibility.
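
The loss of unimodality can be checked numerically. The sketch below is standalone Python, not the edgeR implementation. It assumes the simplest case: one library per group, equal library sizes, and two independent negative binomial counts with the same mean and dispersion. Under those assumptions the conditional distribution of the first count given the total depends only on the dispersion. The two p-value functions are simplified readings of the "smallp" and "doubletail" rejection regions (sum of all outcomes no more probable than the observed one, and twice the smaller tail probability); the function names are hypothetical.

```python
from math import exp, lgamma


def conditional_pmf(total, dispersion):
    """P(y1 = k | y1 + y2 = total) for two iid negative binomial counts.

    The common mean cancels in the conditioning, so only the NB size
    parameter r = 1 / dispersion matters.
    """
    r = 1.0 / dispersion

    def log_weight(k):
        # log of the generalized binomial coefficient C(k + r - 1, k)
        return lgamma(k + r) - lgamma(r) - lgamma(k + 1)

    logs = [log_weight(k) + log_weight(total - k) for k in range(total + 1)]
    biggest = max(logs)
    weights = [exp(v - biggest) for v in logs]  # rescale to avoid underflow
    norm = sum(weights)
    return [w / norm for w in weights]


def smallp_pvalue(pmf, k):
    """Sum the probabilities of outcomes no more probable than the observed one."""
    threshold = pmf[k] * (1 + 1e-7)  # small tolerance for floating-point ties
    return min(1.0, sum(p for p in pmf if p <= threshold))


def doubletail_pvalue(pmf, k):
    """Twice the smaller of the two tail probabilities, capped at 1."""
    lower = sum(pmf[: k + 1])
    upper = sum(pmf[k:])
    return min(1.0, 2 * min(lower, upper))


total = 20
for dispersion in (0.1, 10.0):
    pmf = conditional_pmf(total, dispersion)
    shape = "unimodal" if pmf[total // 2] > pmf[0] else "U-shaped"
    print(f"dispersion = {dispersion}: conditional distribution is {shape}")
    # Perfectly balanced counts (10 vs 10) should never look significant.
    print("  smallp     p-value for 10 vs 10:", smallp_pvalue(pmf, 10))
    print("  doubletail p-value for 10 vs 10:", doubletail_pvalue(pmf, 10))
```

With the very large dispersion the conditional distribution is U-shaped, so the balanced outcome is the least probable one and the "smallp" rule assigns it a small p-value, while the "doubletail" rule does not.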
Best wishes
Gordon
> Date: Wed, 06 Mar 2013 19:22:06 -0500
> From: Zhuzhu Zhang <zhuzhuz at email.unc.edu>
> To: bioconductor at r-project.org
> Subject: [BioC] edgeR: new defaults of estimateTagwiseDisp and
> exactTest
>
> Dear All,
>
> I'm running edgeR 3.0.8 and notice that the results are considerably
> different than those from older versions. I realized that it was likely
> due to the different prior.df I used in Function estimateTagwiseDisp,
> thanks to an earlier discussion on the list-
> https://stat.ethz.ch/pipermail/bioconductor/2012-December/049644.html
>
> I have a few more questions:
>
> 1. How to choose a proper prior.df (or prior.n)?
>
> 2. How is the new default method of exactTest different from the old
> default (rejection.region = "smallp")? How does it improve the performance?
>
> 3. In general, what parameters should I tune for different datasets,
> when using function estimateTagwiseDisp and exactTest? I used the
> defaults but realized that they may not be most appropriate.
>
> Thank you for your time and attention. Any suggestions and comments
> would be extremely helpful and appreciated.
>
> Thanks,
> Zhuzhu
>