[BioC] edgeR:Differences in results between two different versions of edgeR

Gordon K Smyth smyth at wehi.EDU.AU
Wed Dec 5 00:35:38 CET 2012


Dear Dorota,

The important settings are prior.df and trend.

prior.n and prior.df are related through prior.df = prior.n * residual.df, 
and your experiment has residual.df = 36 - 12 = 24.  So the old setting of 
prior.n=10 is equivalent for your data to prior.df = 240, a very large 
value.  Going the other way, the new setting of prior.df=10 is equivalent 
to prior.n=10/24.
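
As a quick check of the arithmetic above, here is a minimal base-R sketch. The helper names prior_n_to_df and prior_df_to_n are made up for illustration and are not edgeR functions; the only thing taken from the text is the relation prior.df = prior.n * residual.df.

```r
# Hypothetical helpers (not part of edgeR) for the relation
# prior.df = prior.n * residual.df described above.
n.libraries <- 36
n.groups    <- 12
residual.df <- n.libraries - n.groups   # 36 - 12 = 24

prior_n_to_df <- function(prior.n, residual.df) prior.n * residual.df
prior_df_to_n <- function(prior.df, residual.df) prior.df / residual.df

old.as.df <- prior_n_to_df(10, residual.df)  # old prior.n=10 on the prior.df scale
new.as.n  <- prior_df_to_n(10, residual.df)  # new prior.df=10 on the prior.n scale

print(old.as.df)
print(new.as.n)
```

With 24 residual df, the first value is 240 and the second is 10/24.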

To recover old results with the current software you would use

  estimateTagwiseDisp(object, prior.df=240, trend="none")

To get the new default from old software you would use

  estimateTagwiseDisp(object, prior.n=10/24, trend=TRUE)

Note that the old trend method is equivalent to trend="loess" in the new 
software rather than to the new default, trend="movingave". Use 
plotBCV(object) to check whether a trend is needed at all.

Note you could also use

  prior.n <- getPriorN(object, prior.df=10)

to map between prior.df and prior.n.

There has also been a change in the default behaviour of exactTest().  To 
make the new exactTest() behave like the old version, you would use

   exactTest(object, rejection.region="smallp")

The new default gives much more reliable results than the old when the 
dispersion is very large.
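
To show how these settings fit together, here is a rough sketch on simulated counts. It assumes the current edgeR classic pipeline (DGEList, calcNormFactors, estimateCommonDisp, estimateTagwiseDisp, exactTest). The simulated data, the object names and the group layout are invented for illustration, so with your own data you would substitute your DGEList and your own residual df.

```r
# Sketch only: simulated data and hypothetical object names.
library(edgeR)

set.seed(1)
n.genes <- 200
group   <- factor(rep(c("A", "B"), each = 3))
counts  <- matrix(rnbinom(n.genes * 6, mu = 50, size = 5), nrow = n.genes)
rownames(counts) <- paste0("gene", seq_len(n.genes))

y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)
y <- estimateCommonDisp(y)

# Residual df = number of libraries minus number of groups (here 6 - 2 = 4).
residual.df <- ncol(y) - nlevels(group)

# "Old-style" settings: prior.n = 10 expressed as prior.df, no trend,
# and the old rejection region for the exact test.
y.old  <- estimateTagwiseDisp(y, prior.df = 10 * residual.df, trend = "none")
et.old <- exactTest(y.old, rejection.region = "smallp")

# Current defaults for comparison.
y.new  <- estimateTagwiseDisp(y)
et.new <- exactTest(y.new)

# Compare the two sets of p-values gene by gene.
summary(et.old$table$PValue - et.new$table$PValue)
```

On real data, comparing topTags(et.old) with topTags(et.new) shows how much of the difference between versions is due to these settings alone.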

Best wishes
Gordon

> Date: Mon, 03 Dec 2012 19:36:58 +0100
> From: "Dorota Herman" <dorota.herman at psb.vib-ugent.be>
> To: Bioconductor mailing list <bioconductor at r-project.org>
> Subject: [BioC] edgeR:Differences in results between two different
> versions of edgeR
>
> Dear list,
>
> when I run the same code on RNA-seq data to find differentially 
> expressed genes using exactTest() in two different versions of edgeR, I 
> obtain considerably different results. The data set contains 36 
> libraries divided into 12 groups, and each library consists of 24 000 
> genes (none of which has all zero counts). While the older version 
> (edgeR_2.0.5) gives me 97 significantly differentially expressed genes 
> between two selected groups, the newer version (edgeR_3.0.4) does not 
> find any significantly differentially expressed genes; moreover, the FDR 
> is less than 1 for only 13 genes. I realize these two versions are far 
> apart in their development. However, I would still be interested in the 
> reasons for such a difference.
>
> Running the same code in parallel in the two versions of edgeR, I found 
> that the difference is most likely attributable to the 
> estimateTagwiseDisp() function, whose signatures are
>
> estimateTagwiseDisp(object, prior.n=10, trend=FALSE, prop.used=NULL, 
> tol=1e-06, grid=TRUE, grid.length=200, verbose=TRUE) in edgeR_2.0.5
>
> and
>
> estimateTagwiseDisp(object, prior.df=20, trend="movingave", span=NULL, 
> method="grid", grid.length=11, grid.range=c(-6,6), tol=1e-06, 
> verbose=FALSE) in edgeR_3.0.4
>
> The parameters prior.n and prior.df seem to have the greatest impact, as 
> their settings determine how strongly the tagwise dispersions are 
> influenced by the common dispersion. Although setting prior.df very low 
> (which would be the equivalent of a high prior.n) makes a difference in 
> the FDR values, the results from the two edgeR versions are still very 
> distinct, and so are the estimated $tagwise.dispersion values. Another 
> candidate parameter is prop.used, but I am not sure whether its 
> equivalent in edgeR_3.0.4 is the "span" parameter. Is it? On the other 
> hand, there are parameters related to the estimation algorithm that I 
> would not expect to cause such a difference in the outcome. Could they?
>
> What am I missing here? Which parameter settings would make the 
> outcomes of DE gene analyses more comparable between the two edgeR 
> versions?
>
> Best wishes
> Dorota
>
>
> ==================================================================
> Dorota Herman, PhD
> VIB Department of Plant Systems Biology, Ghent University
> Technologiepark 927
> 9052 Gent, Belgium
> Tel: +32 (0)9 3313692
> Email:dorota.herman at psb.vib-ugent.be
> Web: http://www.psb.ugent.be
