[BioC] DEXSeq update results change
António Miguel de Jesus Domingues
amjdomingues at gmail.com
Thu Aug 21 11:57:45 CEST 2014
Hi Alejandro,
thanks again for looking into this.
> I had a look at your data. Apparently the difference in dispersion
> estimates between the old and the new versions of DEXSeq can make a
> difference in the coefficients of the GLM, and therefore in the exon fold
> changes. But these changes seem to affect specifically only those
> exons with very low counts.
This is very reassuring and makes sense. The new version is the way to go
then :)
Best regards,
António
> For example, with the objects that you sent me:
>
> select <- rowSums( dxr$countData ) > 10
> plot( dxr_new$`log2fold_3_c_GFP_c`[select],
>       dxr_old$`log2fold.3_c_c.GFP_c_c.`[select] )
>
> These numbers/plots give a much more reasonable picture. The differences
> come from those exons where noise is predominant. I will dig more into this,
> but I would not worry so much about it; the signs for the significant exons
> are consistent anyway:
>
> select2 <- which(dxr_old$padjust < 0.1)
> table( dxr_new$`log2fold_3_c_GFP_c`[select2] > 0 ,
> dxr_old$`log2fold.3_c_c.GFP_c_c.`[select2] > 0)
>
>         FALSE TRUE
>   FALSE  1630    0
>   TRUE      0  614
>
> Best regards,
> Alejandro
>
>
>
>
> Dear Wolfgang and Alejandro,
>>
>> First of all, thank you for looking into this.
>>
>> can you send one or more specific examples, i.e.
>> - the count table for the affected gene(s), for all its exons,
>> and/or the plotDEXSeq output
>> - the size factors
>>
>>
>> I have prepared a data set+script for testing that will follow in a
>> separate private email, so that you can look into this in detail. While
>> preparing it I think I spotted where the difference in results might
>> originate *(1)*.
>>
>>
>> Let me clarify that my concern is not with a particular exon, but rather
>> with the general trend (ratio of up-regulated / down-regulated exons) that
>> is changed, particularly in the experimental set-up I am sending you.
>>
>> That also leads to the second point - with only two replicates per
>> condition, expectations about reproducibility of the result should
>> be modest. No amount of statistical software can undo that.
>>
>>
>> I am well aware of that :) In defence of the data, I should say that the
>> experimental validation of the DGE results (for this same data) was nearly
>> 100%. So yes, few replicates can be an issue, but we have some experimental
>> validation to give us assurance that not all is bad.
>>
>> @ Alejandro
>>
>> Just an additional question, do you see the shift in fold changes
>> for all your exons or only for a subset of them?
>> In older versions there was a bug that was causing some label
>> swaps in the result columns, but this should be fixed in the most
>> recent versions (I just want to make sure it is fixed!). As
>> Wolfgang mentions, this would become evident by looking at the
>> plotDEXSeq output (by looking at the normalized counts and exon
>> usage).
>>
>>
>>
>> The scatter plot of fold change of new vs old version is a bit funky I
>> must say:
>> https://www.dropbox.com/s/l3snr4epgwbkty8/foldchange_comparison.png
>>
>>
>> *(1)*
>>
>> while playing with the example data to send you, I noticed what could be
>> an explanation while counting significantly changed exons:
>>
>> https://www.dropbox.com/s/7zc4n352ftjzqqe/nHits_comparison.pdf
>>
>> In the old version of DEXSeq without a fold-change cut-off, there are
>> more exons with decreased inclusion than with increased inclusion
>> (~2500/1500 exons). With increasingly higher fold-change cut-offs this is
>> inverted. For instance, with a cut-off of 10% it is 2000/1500, and with a
>> cut-off of 50% it is 80/400. So a completely different trend. Using the new
>> DEXSeq version, changing the FC cut-off makes no difference: the trend is
>> always more exons with increased inclusion, which is sort of what I would
>> expect.
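For readers who want to reproduce this kind of tally, here is a minimal sketch in base R. It is not the script used in this thread: the data frame `res`, its columns `log2fc` and `padj`, and the helper `count_by_direction` are hypothetical names, and the toy values are made up purely for illustration. It also assumes that a fold-change cut-off of x% is applied symmetrically on the log2 scale as `log2(1 + x)`, which may differ from how the numbers above were obtained.

```r
# Hypothetical toy data: log2 fold changes and adjusted p-values for 8 exons.
res <- data.frame(
  log2fc = c(-0.05, -0.10, -0.12, 0.20, 0.70, 0.90, -0.80, 1.20),
  padj   = c( 0.01,  0.02,  0.03, 0.04, 0.01, 0.02,  0.05, 0.50)
)

# Count significant exons with decreased ("down") and increased ("up")
# inclusion. fc_cutoff is a fractional change (0.5 means 50%); the
# conversion to a symmetric log2 threshold is an assumption of this sketch.
count_by_direction <- function(res, fc_cutoff = 0, alpha = 0.1) {
  thr <- log2(1 + fc_cutoff)
  sig <- !is.na(res$padj) & res$padj < alpha & abs(res$log2fc) >= thr
  c(down = sum(sig & res$log2fc < 0),
    up   = sum(sig & res$log2fc > 0))
}

# One row per cut-off, columns "down" and "up".
cutoffs <- c(0, 0.1, 0.5)
tally <- t(sapply(cutoffs, function(fc) count_by_direction(res, fc)))
rownames(tally) <- paste0("fc>=", cutoffs * 100, "%")
print(tally)
```

In the toy data, several exons with tiny negative fold changes pass the significance filter, so the down/up ratio flips as soon as a cut-off is applied. That is the pattern one would expect if small, noisy fold changes dominate one direction.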
>>
>> Could it be that the old version is less efficient at estimating, not the
>> fold-changes themselves, but rather the dispersions, when the differences
>> are minor? That would explain the differences I observed. And we only have
>> 2 replicates, so we cannot expect miracles from DEXSeq.
>>
>> Best regards,
>> António
>>
>>
>> On 16 August 2014 12:24, Wolfgang Huber <whuber at embl.de> wrote:
>>
>> Dear Antonio
>>
>> can you send one or more specific examples, i.e.
>> - the count table for the affected gene(s), for all its exons,
>> and/or the plotDEXSeq output
>> - the size factors
>>
>> This should help all of us understand better, and perhaps fix,
>> what you’re unhappy about.
>> What DEXSeq does is not a black box, it is in fact very simple, so
>> we should be able to get to the bottom of this.
>>
>> Regarding the question in the second paragraph: if you have reason
>> to assume that the biological variability is the same in all your
>> conditions (knockdowns), then the joint dispersion estimation will
>> be more precise. But it is not biologically implausible that the
>> assumption may be wrong (e.g. because of the different efficiency
>> of RNAi), leading to underestimation of the true biological
>> variability (and therefore over-calling of results) in some conditions.
>>
>> That also leads to the second point - with only two replicates per
>> condition, expectations about reproducibility of the result should
>> be modest. No amount of statistical software can undo that.
>>
>> Best wishes
>> Wolfgang
>>
>>
>>
>> --
>> --
>> António Miguel de Jesus Domingues, PhD
>> Postdoctoral researcher
>> Deep Sequencing Group - SFB655
>> Biotechnology Center (Biotec)
>> Technische Universität Dresden
>> Fetscherstraße 105
>> 01307 Dresden
>>
>> Phone: +49 (351) 458 82362
>> Email: antonio.domingues(at)biotec.tu-dresden.de
>>
>> --
>> The Unbearable Lightness of Molecular Biology
>>
>
>
--
--
António Miguel de Jesus Domingues, PhD
Postdoctoral researcher
Deep Sequencing Group - SFB655
Biotechnology Center (Biotec)
Technische Universität Dresden
Fetscherstraße 105
01307 Dresden
Phone: +49 (351) 458 82362
Email: antonio.domingues(at)biotec.tu-dresden.de
--
The Unbearable Lightness of Molecular Biology