[BioC] edgeR: new defaults of estimateTagwiseDisp and exactTest
Gordon K Smyth
smyth at wehi.EDU.AU
Wed Mar 13 00:37:03 CET 2013
On Tue, 12 Mar 2013, Zhuzhu Zhang wrote:
> Dear Gordon,
>
> Thank you very much for your reply. That was greatly helpful.
>
> For prior.df, would you recommend that the user use the default value in the
> current version, or choose it differently for different datasets since it
> varies as the data size changes?
Use the default value. As I tried to explain, the whole point of using
prior.df instead of prior.n is that the optimal value for prior.df does
not depend on data size.
> If the latter, would the same principle of
> choosing prior.n be applied?
prior.n is a function of prior.df. See ?getPriorN.
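
As a rough illustration of why a fixed prior.df adapts to the size of the
experiment, here is a minimal base R sketch. It assumes the relation
documented in ?getPriorN, namely that prior.n is prior.df divided by the
residual degrees of freedom (number of libraries minus number of groups).
The helper prior_n_from_df is hypothetical and not part of edgeR, and the
prior.df value used is purely illustrative, not necessarily the current
default.

```r
# Hypothetical helper, not an edgeR function. Assumes the relation described
# in ?getPriorN: prior.n = prior.df / residual df, where the residual df is
# the number of libraries minus the number of groups.
prior_n_from_df <- function(n_libs, n_groups, prior_df) {
  residual_df <- n_libs - n_groups
  stopifnot(residual_df > 0)
  prior_df / residual_df
}

# Same prior.df applied to two-group experiments of different sizes.
prior_df <- 10           # illustrative value only
n_libs <- c(4, 6, 12)    # total number of libraries in each experiment
prior_n <- sapply(n_libs, prior_n_from_df, n_groups = 2, prior_df = prior_df)

print(data.frame(n_libs = n_libs, prior_df = prior_df, prior_n = prior_n))
```

Under that assumption, prior.n shrinks as the number of libraries grows
while prior.df stays the same, which is why one default can serve datasets
of different sizes.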
Gordon
> Thanks,
> Zhuzhu
>
>
>
>
> On 3/8/13 2:06 AM, Gordon K Smyth wrote:
>> Dear Zhuzhu,
>>
>> The edgeR User's guide gives case studies of how we intend edgeR to be
>> used. You will see that we expect users to generally use the default
>> parameters. Although there are a great many parameters that can be
>> varied in principle in the edgeR functions, we do not expect them to be
>> changed in most analyses.
>>
>> One exception was the prior.n argument to estimateTagwiseDisp. It was
>> never our intention that a fixed value for prior.n would be used for
>> datasets of different sizes, so previous versions of the User's Guide
>> used to explain how to set this parameter. Several versions ago, we
>> eliminated the prior.n argument and replaced it with prior.df so that
>> the default behavior was what we intended.
>>
>> Over time we have reduced the default value for prior.df somewhat.
>> Larger values of prior.df give more priority to genes with large fold
>> changes between groups. Smaller values of prior.df give more priority
>> to genes with small dispersion values, i.e., to genes that are
>> consistent between replicates.
>>
>> You ask about exactTest(). To learn more about the different rejection
>> regions, you should start by reading the help page for exactTest(), but
>> I doubt that this is causing you a problem. The theory behind the
>> original exact negative binomial test of Robinson and Smyth (2008)
>> presupposed that the conditional distribution of the counts given the
>> genewise total was unimodal (like a binomial distribution). This is
>> true for typical dispersion values, but is not true for very large
>> dispersion values. Hence the original exactTest can give totally
>> inappropriate results when the dispersion is very, very large. The new
>> rejection region is similar to the original when the dispersion is
>> smallish, but gives sensible results in any situation. Hence it should
>> be used. The old rejection region is preserved as an option only for
>> backward compatibility.
>>
>> Best wishes
>> Gordon
>>
>>
>>> Date: Wed, 06 Mar 2013 19:22:06 -0500
>>> From: Zhuzhu Zhang <zhuzhuz at email.unc.edu>
>>> To: bioconductor at r-project.org
>>> Subject: [BioC] edgeR: new defaults of estimateTagwiseDisp and
>>> exactTest
>>>
>>> Dear All,
>>>
>>> I'm running edgeR 3.0.8 and notice that the results are considerably
>>> different than those from older versions. I realized that it was
>>> likely due to the different prior.df I used in Function
>>> estimateTagwiseDisp, thanks to an earlier discussion on the list-
>>> https://stat.ethz.ch/pipermail/bioconductor/2012-December/049644.html
>>>
>>> I have a few more questions:
>>>
>>> 1. How to choose a proper prior.df (or prior.n)?
>>>
>>> 2. How is the new default method of exactTest different from the old
>>> default (rejection.region = "smallp")? How does it improve the
>>> performance?
>>>
>>> 3. In general, what parameters should I tune for different datasets,
>>> when using function estimateTagwiseDisp and exactTest? I used the
>>> defaults but realized that they may not be most appropriate.
>>>
>>> Thank you for your time and attention. Any suggestions and comments
>>> would be extremely helpful and appreciated.
>>>
>>> Thanks,
>>> Zhuzhu