[BioC] DESeq variance question
Gordon K Smyth
smyth at wehi.EDU.AU
Mon Dec 5 00:28:25 CET 2011
Dear Simon and Steffen,
> Date: Sat, 03 Dec 2011 20:36:10 +0100
> From: Simon Anders <anders at embl.de>
> To: Steffen Priebe <Steffen.Priebe at hki-jena.de>,
> bioconductor at r-project.org
> Subject: Re: [BioC] DESeq variance question
>
> Dear Steffen
>
> On 2011-12-02 13:53, Steffen Priebe wrote:
>> I was using DESeq (and edgeR) for differentially expression analysis.
>> In my current dataset I compare 3 biological replicates of control vs.
>> 3 biol. replicates from a mutant. The resulting 4 top genes according
>> adjusted pvalue by DESeq and edgeR have a very high variance. (The
>> reason for this is, that this are genes located on the chrY and only
>> one replicate of the mutant was male)
>>
>> My question is now, how can genes with such a high variance of the
>> counts result in this small pvalues? Is there any way to avoid this,
>> because I think this are False Positives?
>>
>> Attached you can find the combined result table of DESeq and edgeR for
>> the top 100 genes. The problem occurs for the first 4 genes. The raw
>> counts are stated in columns P-U (P-R: Mutant, T-U Control).
...
> EdgeR, with its empirical Bayesian approach (implemented in its function
> 'estimateTagwiseDispersion') should typically give p values in the
> middle between DESeq's result using the 'maximum' and its the
> 'fitted-only' sharing modes. However, at least in your case, edgeR
> seemed to have stayed too close to the fitted values (or: to the 'common
> dispersion', in edgeR's terminology)
Common dispersion is not edgeR terminology for DESeq's "fitted values",
and (in the current Bioconductor release) edgeR moderates towards a local
prior rather than towards the common dispersion. By default, edgeR does not
fit any model to the dispersion, hence does not have fitted values.
Instead it uses a prior based on locally weighted likelihood.
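To illustrate the general idea of moderating a genewise dispersion towards a
prior value, here is a minimal Python sketch. It is not edgeR's algorithm:
edgeR works with weighted likelihood, whereas the sketch substitutes a plain
weighted average as a stand-in. The counts, the prior dispersion of 0.1, the
prior weight of 10 and all function names are hypothetical and chosen only
for illustration.

```python
import statistics


def moment_dispersion(counts):
    """Method-of-moments NB dispersion: var = mu + phi * mu^2, solved for phi."""
    mu = statistics.mean(counts)
    var = statistics.variance(counts)  # sample variance (n - 1 denominator)
    if mu == 0:
        return 0.0
    return max((var - mu) / mu ** 2, 0.0)


def moderate(phi_gene, df_gene, phi_prior, prior_weight):
    """Weighted average of genewise and prior dispersion.

    A simplified stand-in for empirical Bayes moderation, NOT the
    weighted-likelihood computation that edgeR performs.
    """
    return (df_gene * phi_gene + prior_weight * phi_prior) / (df_gene + prior_weight)


def nb_variance(mu, phi):
    """Variance implied by a negative binomial with mean mu and dispersion phi."""
    return mu + phi * mu ** 2


# Hypothetical counts for one group: two replicates with no reads, one with many.
counts = [0, 0, 300]
mu = statistics.mean(counts)

phi_gene = moment_dispersion(counts)   # raw genewise estimate
phi_prior = 0.1                        # hypothetical prior (local or common) value
phi_mod = moderate(phi_gene, df_gene=len(counts) - 1,
                   phi_prior=phi_prior, prior_weight=10)

print("genewise dispersion :", phi_gene)
print("moderated dispersion:", phi_mod)
print("variance assumed with genewise dispersion :", nb_variance(mu, phi_gene))
print("variance assumed with moderated dispersion:", nb_variance(mu, phi_mod))
```

In this toy setting the prior carries more weight than the gene's own two
degrees of freedom, so the moderated dispersion lands much closer to the
prior than to the raw estimate, and the variance assumed for the gene is far
smaller than the observed one. That is one way a gene with very variable
counts can still receive a small p-value.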
> as you wrote it also gave you p values for your high-variance genes that
> you considered implausibly low.
>
> Simon
We don't actually know whether tagwise dispersion was used in the edgeR
analysis, nor have we seen the gene list (at least I haven't). Without
knowing either the analysis or the results, it would seem premature to
draw conclusions about the behaviour of estimateTagwiseDisp.
Best wishes
Gordon