[BioC] A few Q's on using DEXSeq with mucho data

Mon Mar 12 03:06:58 CET 2012

Hi Simon,

Just wanted to say thanks for taking the time to provide these
detailed responses.

Cheers,
-steve

On Sun, Mar 11, 2012 at 7:04 AM, Simon Anders <anders at embl.de> wrote:
> Hi Steve
>
> On 2012-03-08 19:31, Steve Lianoglou wrote:
> [...]
>
>> I was trying to convey my concern (maybe wrong) that the "fitted
>> dispersion" we end up using is a function of the mean read count for
>> the bin, where the mean is calculated from the expression of the gene
>> across all samples/conditions.
>>
>> It could be that in one particular two-experiment comparison I want to
>> make, the expression of the exon is quite high in both samples. In
>> this case, the higher "averaged normalized count value" (x-axis of fig
>> 2 in your pre-paper) would likely be associated w/ a lower dispersion
>> when doing the test, therefore increasing our power.
>>
>> It could be, however, that in the rest of the conditions the gene (and
>> therefore the exon) would be expressed at a lower level, and the
>> dispersion for that bin would then be estimated at a higher amount,
>> decreasing the power in this case.
>
> [...]
>
> You are right, this could be an argument in favour of subsetting before
> dispersion estimation. I am not quite sure how important this effect is in
> practice, though.
>
> Bear in mind that the dispersion does not contain the Poisson noise. For
> mean µ and a dispersion α, the variance is v = µ + α µ², and the coefficient
> of variation (CV) squared is CV² = v/µ² = 1/µ + α.
>
> Hence, the dominant term for the dependence of variance and hence power on
> mean is the Poisson term 1/µ, and not so much any remaining dependence of α
> on µ. A negative binomial generalized linear model takes this into account:
> it uses the mean-variance relation
> v = µ + α µ², with α, not v, considered constant across the model, precisely
> because this handles well the effect on variance of differences in overall
> mean in the different treatment groups.
>
> Nevertheless, as not only CV² but also α itself seems to decrease with µ,
> this is not perfect.
>
>  Simon
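
To put rough numbers on the relation quoted above, here is a minimal sketch
that evaluates CV² = 1/µ + α from v = µ + α µ² at a few mean counts. The
dispersion value 0.1 and the chosen means are made up purely for
illustration (they are not estimates from any dataset), and the function
name cv_squared is hypothetical, not part of DEXSeq.

```python
def cv_squared(mu, alpha):
    """Squared coefficient of variation under v = mu + alpha * mu**2."""
    variance = mu + alpha * mu ** 2
    return variance / mu ** 2


# Hypothetical dispersion, held constant across means as in the NB GLM.
alpha = 0.1

for mu in (1, 10, 100, 1000):
    poisson_term = 1.0 / mu  # the part of CV^2 that depends on the mean
    total = cv_squared(mu, alpha)
    print(f"mu={mu:>5}  1/mu={poisson_term:.4f}  "
          f"alpha={alpha:.4f}  CV^2={total:.4f}")
```

With α held fixed, all of the change in CV² across rows comes from the 1/µ
term. That term is the larger of the two while µ is below 1/α; above that,
CV² levels off near α.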

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact