[BioC] edgeR: new defaults of estimateTagwiseDisp and exactTest

Gordon K Smyth smyth at wehi.EDU.AU
Wed Mar 13 00:37:03 CET 2013


On Tue, 12 Mar 2013, Zhuzhu Zhang wrote:

> Dear Gordon,
>
> Thank you very much for your reply. That was greatly helpful.
>
> For prior.df, would you recommend that the user use the default value in the 
> current version, or choose it differently for different datasets since it 
> varies as the data size changes?

Use the default value.  As I tried to explain, the whole point of using 
prior.df instead of prior.n is that the optimal value for prior.df does 
not depend on data size.

> If the latter, would the same principle of 
> choosing prior.n be applied?

prior.n is a function of prior.df.  See ?getPriorN.
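
A rough illustration of that relationship, written as plain Python rather than edgeR code. It assumes the rule described on the getPriorN help page: prior.n is prior.df divided by the residual degrees of freedom, taken here as the number of libraries minus the number of groups. The helper name is hypothetical, and the value 10 is only an illustrative prior.df, not a statement about any default.

```python
def prior_n_from_prior_df(prior_df, n_libraries, n_groups):
    """Hypothetical helper, not an edgeR function.

    Assumes prior.n = prior.df / residual.df, where
    residual.df = number of libraries - number of groups.
    """
    residual_df = n_libraries - n_groups
    if residual_df <= 0:
        raise ValueError("need more libraries than groups")
    return prior_df / residual_df


if __name__ == "__main__":
    # Same prior.df, two groups, increasing numbers of libraries.
    for n_libraries in (4, 6, 12):
        print(n_libraries, prior_n_from_prior_df(10, n_libraries, 2))
```

With prior.df held at 10 and two groups, the implied prior.n is 5, 2.5 and 1 for 4, 6 and 12 libraries. A fixed prior.df therefore corresponds to a different prior.n for each dataset size, which is why prior.n itself was never meant to be held fixed.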

Gordon

> Thanks,
> Zhuzhu
>
>
>
>
> On 3/8/13 2:06 AM, Gordon K Smyth wrote:
>> Dear Zhuzhu,
>> 
>> The edgeR User's guide gives case studies of how we intend edgeR to be 
>> used.  You will see that we expect users to generally use the default 
>> parameters.  Although there are a great many parameters that can be 
>> varied in principle in the edgeR functions, we do not expect them to be 
>> changed in most analyses.
>> 
>> One exception was the prior.n argument to estimateTagwiseDisp.  It was 
>> never our intention that a fixed value for prior.n would be used for 
>> datasets of different sizes, so previous versions of the User's Guide 
>> used to explain how to set this parameter.  Several versions ago, we 
>> eliminated the prior.n argument and replaced it with prior.df so that 
>> the default behavior was what we intended.
>> 
>> Over time we have reduced the default value for prior.df somewhat. 
>> Larger values of prior.df give more priority to genes with large fold 
>> changes between groups.  Smaller values of prior.df give more priority 
>> to genes with small dispersion values, i.e., to genes that are 
>> consistent between replicates.
>> 
>> You ask about exactTest().  To learn more about the different rejection 
>> regions, you should start by reading the help page for exactTest(), but 
>> I doubt that this is causing you a problem.  The theory behind the 
>> original exact negative binomial test of Robinson and Smyth (2008) 
>> presupposed that the conditional distribution of the counts given the 
>> genewise total was unimodal (like a binomial distribution).  This is 
>> true for typical dispersion values, but is not true for very large 
>> dispersion values. Hence the original exactTest can give totally 
>> inappropriate results when the dispersion is very, very large. The new 
>> rejection region is similar to the original when the dispersion is 
>> smallish, but gives sensible results in any situation.  Hence it should 
>> be used.  The old rejection region is preserved as an option only for 
>> backward compatibility.
>> 
>> Best wishes
>> Gordon
>> 
>> 
>>> Date: Wed, 06 Mar 2013 19:22:06 -0500
>>> From: Zhuzhu Zhang <zhuzhuz at email.unc.edu>
>>> To: bioconductor at r-project.org
>>> Subject: [BioC] edgeR: new defaults of estimateTagwiseDisp and
>>>     exactTest
>>> 
>>> Dear All,
>>> 
>>> I'm running edgeR 3.0.8 and notice that the results are considerably 
>>> different than those from older versions. I realized that it was 
>>> likely due to the different prior.df I used in the function 
>>> estimateTagwiseDisp, thanks to an earlier discussion on the list: 
>>> https://stat.ethz.ch/pipermail/bioconductor/2012-December/049644.html
>>> 
>>> I have a few more questions:
>>> 
>>> 1. How to choose a proper prior.df (or prior.n)?
>>> 
>>> 2. How is the new default method of exactTest different from the old 
>>> default (rejection.region = "smallp")? How does it improve the 
>>> performance?
>>> 
>>> 3. In general, what parameters should I tune for different datasets, 
>>> when using the functions estimateTagwiseDisp and exactTest? I used the 
>>> defaults but realized that they may not be most appropriate.
>>> 
>>> Thank you for your time and attention. Any suggestions and comments
>>> would be extremely helpful and appreciated.
>>> 
>>> Thanks,
>>> Zhuzhu
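
The remark quoted above, that the conditional distribution of the counts given the genewise total stops being unimodal when the dispersion is very large, can be checked numerically in a toy setting. The sketch below is plain Python, not edgeR's implementation. It assumes the simplest possible case: two libraries, one per group, with equal library sizes and equal means, so that both counts are independent negative binomial with the same size parameter r = 1/dispersion. Under those assumptions the conditional weight of y1 = k given y1 + y2 = total is proportional to Gamma(k + r) * Gamma(total - k + r) / (k! * (total - k)!). Both function names are hypothetical.

```python
from math import exp, lgamma


def conditional_pmf(total, dispersion):
    """P(y1 = k | y1 + y2 = total) for k = 0..total.

    Toy assumption: y1 and y2 are i.i.d. negative binomial counts
    with size parameter r = 1 / dispersion.
    """
    r = 1.0 / dispersion
    log_w = [
        lgamma(k + r) + lgamma(total - k + r)
        - lgamma(k + 1) - lgamma(total - k + 1)
        for k in range(total + 1)
    ]
    # Subtract the maximum before exponentiating, for numerical stability.
    m = max(log_w)
    w = [exp(v - m) for v in log_w]
    s = sum(w)
    return [x / s for x in w]


def is_unimodal(pmf, tol=1e-12):
    """True if pmf rises (weakly) to a single peak and then falls (weakly)."""
    i, n = 0, len(pmf)
    while i + 1 < n and pmf[i + 1] >= pmf[i] - tol:
        i += 1
    while i + 1 < n and pmf[i + 1] <= pmf[i] + tol:
        i += 1
    return i == n - 1


if __name__ == "__main__":
    for dispersion in (0.1, 10.0):
        print(dispersion, is_unimodal(conditional_pmf(20, dispersion)))
```

In this toy case a dispersion of 0.1 gives a single peak in the middle, as a binomial would, while a dispersion of 10 puts most of the probability at the two extremes with a trough in the middle. A rejection region built on the assumption of a single central peak is then no longer appropriate.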
