[BioC] edgeR: new defaults of estimateTagwiseDisp and exactTest

Tue Mar 12 21:14:58 CET 2013

Dear Gordon,

Thank you very much for your reply. That was greatly helpful.

For prior.df, would you recommend that users keep the default value in 
the current version, or choose it differently for different datasets, 
since it varies as the data size changes? If the latter, would the same 
principle used for choosing prior.n apply?

Thanks,
Zhuzhu

On 3/8/13 2:06 AM, Gordon K Smyth wrote:
> Dear Zhuzhu,
>
> The edgeR User's guide gives case studies of how we intend edgeR to be 
> used.  You will see that we expect users to generally use the default 
> parameters.  Although there are a great many parameters that can be 
> varied in principle in the edgeR functions, we do not expect them to be 
> changed in most analyses.
>
> One exception was the prior.n argument to estimateTagwiseDisp.  It was 
> never our intention that a fixed value for prior.n would be used for 
> datasets of different sizes, so previous versions of the User's Guide 
> used to explain how to set this parameter.  Several versions ago, we 
> eliminated the prior.n argument and replaced it with prior.df so that 
> the default behavior was what we intended.
>
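
The link between the old prior.n and the newer prior.df can be sketched in a few lines of base R. This is only an illustration: it assumes that prior.df equals prior.n multiplied by the residual degrees of freedom (number of libraries minus number of groups). The helper prior_n_for is hypothetical, not an edgeR function, and the prior.df of 10 is an arbitrary example value, not a statement about any version's default.

```r
# Hypothetical helper (not part of edgeR): the prior.n implied by a fixed
# prior.df, ASSUMING prior.df = prior.n * (number of libraries - number of groups).
prior_n_for <- function(n_libs, n_groups, prior_df) {
  residual_df <- n_libs - n_groups
  stopifnot(residual_df > 0)   # at least one residual df is needed
  prior_df / residual_df
}

# Two groups, with 4, 6 and 12 libraries in total; prior_df = 10 is illustrative.
implied <- sapply(c(4, 6, 12), prior_n_for, n_groups = 2, prior_df = 10)
implied
```

Under that assumption, a fixed prior.df corresponds to a prior.n that shrinks as the experiment grows, which is why a single fixed prior.n does not suit datasets of different sizes.
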
> Over time we have reduced the default value for prior.df somewhat. 
> Larger values of prior.df give more priority to genes with large fold 
> changes between groups.  Smaller values of prior.df give more priority 
> to genes with small dispersion values, i.e., to genes that are 
> consistent between replicates.
>
> You ask about exactTest().  To learn more about the different 
> rejection regions, you should start by reading the help page for 
> exactTest(), but I doubt that this is causing you a problem.  The 
> theory behind the original exact negative binomial test of Robinson 
> and Smyth (2008) presupposed that the conditional distribution of the 
> counts given the genewise total was unimodal (like a binomial 
> distribution).  This is true for typical dispersion values, but is not 
> true for very large dispersion values. Hence the original exactTest 
> can give totally inappropriate results when the dispersion is very, 
> very large. The new rejection region is similar to the original when 
> the dispersion is smallish, but gives sensible results in any 
> situation.  Hence it should be used.  The old rejection region is 
> preserved as an option only for backward compatibility.
>
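
The unimodality point can be checked numerically in base R. The sketch below is not edgeR's implementation. It assumes two negative binomial counts with equal means and a common dispersion, and it assumes that a "smallest probabilities" region sums the probabilities of all outcomes no more likely than the observed one, while a "double tail" region doubles the smaller tail probability. The functions cond_dist, p_smallp and p_doubletail are hypothetical names introduced for illustration.

```r
# Conditional distribution of y1 given y1 + y2 = total, for two negative
# binomial counts with equal mean mu and common dispersion (size = 1/dispersion).
cond_dist <- function(total, dispersion, mu = 10) {
  k <- 0:total
  w <- dnbinom(k, size = 1 / dispersion, mu = mu) *
       dnbinom(total - k, size = 1 / dispersion, mu = mu)
  w / sum(w)
}

# Illustrative "smallest probabilities" p-value: sum over all outcomes that
# are no more likely than the observed one (small tolerance for ties).
p_smallp <- function(p, obs) sum(p[p <= p[obs + 1] * (1 + 1e-7)])

# Illustrative "double tail" p-value: twice the smaller tail, capped at 1.
p_doubletail <- function(p, obs) {
  lower <- sum(p[1:(obs + 1)])
  upper <- sum(p[(obs + 1):length(p)])
  min(1, 2 * min(lower, upper))
}

small <- cond_dist(total = 20, dispersion = 0.1)  # typical dispersion
large <- cond_dist(total = 20, dispersion = 4)    # very large dispersion

# With small dispersion the distribution peaks in the middle; with very
# large dispersion the mass piles up at both ends instead.
c(middle = small[11], end = small[1])
c(middle = large[11], end = large[1])

# Perfectly balanced counts (10 vs 10) under very large dispersion:
p_smallp(large, 10)      # small, although the counts show no difference
p_doubletail(large, 10)  # 1
```

When the conditional distribution is U-shaped, the balanced outcome is the least probable one, so a region built from the smallest probabilities treats it as extreme. The tail-based region does not. In edgeR itself the choice is made through the rejection.region argument of exactTest().
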
> Best wishes
> Gordon
>
>
>> Date: Wed, 06 Mar 2013 19:22:06 -0500
>> From: Zhuzhu Zhang <zhuzhuz at email.unc.edu>
>> To: bioconductor at r-project.org
>> Subject: [BioC] edgeR: new defaults of estimateTagwiseDisp and
>>     exactTest
>>
>> Dear All,
>>
>> I'm running edgeR 3.0.8 and notice that the results are considerably
>> different than those from older versions. I realized that it was likely
>> due to the different prior.df I used in the function estimateTagwiseDisp,
>> thanks to an earlier discussion on the list-
>> https://stat.ethz.ch/pipermail/bioconductor/2012-December/049644.html
>>
>> I have a few more questions:
>>
>> 1. How to choose a proper prior.df (or prior.n)?
>>
>> 2. How is the new default method of exactTest different from the old
>> default (rejection.region = "smallp")? How does it improve the 
>> performance?
>>
>> 3. In general, what parameters should I tune for different datasets
>> when using the functions estimateTagwiseDisp and exactTest? I used the
>> defaults but realized that they may not be the most appropriate.
>>
>> Thank you for your time and attention. Any suggestions and comments
>> would be extremely helpful and appreciated.
>>
>> Thanks,
>> Zhuzhu
>>
>