[BioC] edgeR: new defaults of estimateTagwiseDisp and exactTest

Fri Mar 8 08:06:19 CET 2013

Dear Zhuzhu,

The edgeR User's Guide gives case studies of how we intend edgeR to be 
used.  You will see that we expect users to generally use the default 
parameters.  Although there are a great many parameters that can be varied 
in principle in the edgeR functions, we do not expect them to be changed in 
most analyses.

One exception was the prior.n argument to estimateTagwiseDisp.  It was 
never our intention that a fixed value for prior.n would be used for 
datasets of different sizes, so previous versions of the User's Guide used 
to explain how to set this parameter.  Several versions ago, we eliminated 
the prior.n argument and replaced it with prior.df so that the default 
behavior was what we intended.
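
To illustrate why a fixed prior.n does not suit datasets of different 
sizes, here is a small numerical sketch (plain Python, not edgeR code). 
It assumes the relationship prior.df = prior.n * residual df, where the 
residual df is the number of libraries minus the number of groups.  The 
function name and the design sizes are hypothetical.

```python
def prior_df_from_prior_n(prior_n, n_libraries, n_groups):
    """Prior degrees of freedom implied by a fixed prior.n.

    Assumes prior.df = prior.n * (n_libraries - n_groups).
    """
    residual_df = n_libraries - n_groups
    if residual_df <= 0:
        raise ValueError("need more libraries than groups")
    return prior_n * residual_df


# Hypothetical two-group designs of increasing size, all using prior.n = 10.
for n_libraries in (4, 8, 20):
    implied = prior_df_from_prior_n(10, n_libraries, n_groups=2)
    print(n_libraries, "libraries -> implied prior.df =", implied)
```

With prior.n held fixed, the implied prior.df grows in proportion to the 
residual degrees of freedom.  With prior.df held fixed, the prior carries 
the same number of degrees of freedom whatever the size of the experiment.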

Over time we have reduced the default value for prior.df somewhat. 
Larger values of prior.df give more priority to genes with large fold 
changes between groups.  Smaller values of prior.df give more priority to 
genes with small dispersion values, i.e., to genes that are consistent 
between replicates.

You ask about exactTest().  To learn more about the different rejection 
regions, you should start by reading the help page for exactTest(), but I 
doubt that this is causing you a problem.  The theory behind the original 
exact negative binomial test of Robinson and Smyth (2008) presupposed that 
the conditional distribution of the counts given the genewise total was 
unimodal (like a binomial distribution).  This is true for typical 
dispersion values, but is not true for very large dispersion values. 
Hence the original exactTest can give totally inappropriate results when 
the dispersion is very, very large.  The new rejection region is similar 
to the original when the dispersion is smallish, but gives sensible 
results in any situation.  Hence it should be used.  The old rejection 
region is preserved as an option only for backward compatibility.
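
The loss of unimodality can be seen in a small numerical sketch (plain 
Python, not edgeR code).  It assumes the simplest possible setting: one 
library per group, equal library sizes, and a common dispersion phi.  
Under those assumptions the conditional probability of a count k in the 
first library, given a total of s, is proportional to 
Gamma(k + 1/phi) * Gamma(s - k + 1/phi) / (k! * (s - k)!).  The function 
name and the values of s and phi are illustrative only.

```python
from math import exp, lgamma


def conditional_pmf(total, dispersion):
    """P(Y1 = k | Y1 + Y2 = total) for k = 0..total.

    Assumes two negative binomial counts with equal means and a
    common dispersion, as described in the text above.
    """
    r = 1.0 / dispersion  # negative binomial size parameter
    log_w = [
        lgamma(k + r) + lgamma(total - k + r)
        - lgamma(k + 1) - lgamma(total - k + 1)
        for k in range(total + 1)
    ]
    top = max(log_w)  # subtract the maximum before exponentiating
    w = [exp(v - top) for v in log_w]
    norm = sum(w)
    return [v / norm for v in w]


# Compare a smallish dispersion with a very large one for a total of 10.
for phi in (0.1, 5.0):
    pmf = conditional_pmf(10, phi)
    print("dispersion", phi, [round(p, 3) for p in pmf])
```

With dispersion 0.1 the probabilities peak at the balanced split, much 
like a binomial.  With dispersion 5 they are largest at the two extremes 
(0 and 10) and smallest in the middle, so a rule that rejects for the 
least probable outcomes would single out balanced counts.  This is one 
way to see why a rejection region based on small probabilities can 
misbehave when the dispersion is very large.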

Best wishes
Gordon

> Date: Wed, 06 Mar 2013 19:22:06 -0500
> From: Zhuzhu Zhang <zhuzhuz at email.unc.edu>
> To: bioconductor at r-project.org
> Subject: [BioC] edgeR: new defaults of estimateTagwiseDisp and
> 	exactTest
>
> Dear All,
>
> I'm running edgeR 3.0.8 and notice that the results are considerably
> different than those from older versions. I realized that it was likely
> due to the different prior.df I used in Function estimateTagwiseDisp,
> thanks to an earlier discussion on the list-
> https://stat.ethz.ch/pipermail/bioconductor/2012-December/049644.html
>
> I have a few more questions:
>
> 1. How to choose a proper prior.df (or prior.n)?
>
> 2. How is the new default method of exactTest different from the old
> default (rejection.region = "smallp")? How does it improve the performance?
>
> 3. In general, what parameters should I tune for different datasets,
> when using function estimateTagwiseDisp and exactTest? I used the
> defaults but realized that they may not be most appropriate.
>
> Thank you for your time and attention. Any suggestions and comments
> would be extremely helpful and appreciated.
>
> Thanks,
> Zhuzhu
>
