[BioC] DESeq vs DESeq2 have different DEGs results
Simon Anders
anders at embl.de
Wed May 21 09:22:31 CEST 2014
Hi Catalina
On 21/05/14 03:03, Catalina Aguilar Hurtado wrote:
> If I get almost (~8) no DEGs by the reduced: ~subject and I get results
> (879) using reduced:~1 what is wrong with it?
>
> I don’t understand what is wrong if I want to compare my design with the
> null model of no variability.
You seem to seriously misunderstand what the models mean.
Your two model formulas were:
>> full: ~ subject + treatment
>> reduced: ~ 1
This way, DESeq2 will report to you all genes that seem to be
significantly affected by any of the factors that are present in the full
model and absent in the reduced model, i.e., subject and treatment.
Hence, you get all genes that differ between subjects _or_ that respond
to treatment. But you don't want to see genes that differ between
subjects without being affected by treatment, and this is why "subject"
has to appear in the reduced model, too:
If you do what Mike suggested
>> full: ~ subject + treatment
>> reduced: ~ subject
you account for "subject" in both models. The difference between models
is only "treatment", i.e. you get the genes that respond to treatment.
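To make the effect of the reduced model concrete, here is a minimal sketch
using ordinary least squares and an F test rather than DESeq2's negative
binomial likelihood ratio test, so it is only an analogy. The data values
and all names (data, rss, f_stat) are made up for illustration. Subjects
differ strongly, while the treatment does nothing systematic. The fitted
values use the closed form for a balanced design with one observation per
subject and treatment.

```python
from statistics import mean

# Hypothetical toy values (not from the thread): four subjects, each measured
# once untreated ("ctrl") and once treated ("trt"). Subjects differ strongly;
# the treatment has no systematic effect.
data = {
    "s1": {"ctrl": 10.0, "trt": 10.4},
    "s2": {"ctrl": 20.0, "trt": 19.7},
    "s3": {"ctrl": 30.0, "trt": 30.2},
    "s4": {"ctrl": 40.0, "trt": 39.9},
}

all_values = [y for row in data.values() for y in row.values()]
grand = mean(all_values)
subj_mean = {s: mean(list(row.values())) for s, row in data.items()}
trt_mean = {t: mean([row[t] for row in data.values()]) for t in ("ctrl", "trt")}


def rss(fitted):
    """Residual sum of squares for a model given as fitted(subject, treatment)."""
    return sum(
        (y - fitted(s, t)) ** 2 for s, row in data.items() for t, y in row.items()
    )


def f_stat(rss_reduced, rss_full, df_extra, df_resid):
    """F statistic for comparing a reduced model against a full model."""
    return ((rss_reduced - rss_full) / df_extra) / (rss_full / df_resid)


# Least-squares fits (closed form, valid because the design is balanced).
rss_intercept = rss(lambda s, t: grand)                               # ~ 1
rss_subject = rss(lambda s, t: subj_mean[s])                          # ~ subject
rss_full = rss(lambda s, t: subj_mean[s] + trt_mean[t] - grand)       # ~ subject + treatment

n_subj = len(data)
# Full model has 1 intercept + (n_subj - 1) subject terms + 1 treatment term.
df_resid = len(all_values) - (n_subj + 1)

# Reduced ~ 1: tests subject and treatment together (n_subj extra terms).
f_vs_intercept = f_stat(rss_intercept, rss_full, n_subj, df_resid)
# Reduced ~ subject: tests treatment only (1 extra term).
f_vs_subject = f_stat(rss_subject, rss_full, 1, df_resid)

print("full vs ~ 1:       F =", round(f_vs_intercept, 2))
print("full vs ~ subject: F =", round(f_vs_subject, 2))
```

With reduced ~ 1 the statistic is very large even though the treatment has
no effect, because the subject differences alone make the full model fit
much better. With reduced ~ subject the statistic is small, since only
"treatment" is left to explain anything.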
I should add that all this is nothing specific to DESeq2, but just
usual linear modelling. Any statistics textbook with a chapter on ANOVA
will help you.
Also, look up the difference between an unpaired and a paired t test. In
RNA-Seq analysis, a simple t test is underpowered, unless you have many
samples, and also not quite appropriate, because count data is not
normally distributed. But as most people are familiar with t tests, it
helps to know that the comparison
full: ~ treatment
reduced: ~ 1
is essentially the same as an unpaired two-sample t test, and
full: ~ subject + treatment
reduced: ~ subject
is the same as a paired t test. Hence, reminding yourself about when and
why one uses a paired t test might be illuminating.
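As a reminder of why the pairing matters, here is a small sketch that
computes both t statistics by hand on the same numbers. The values are
invented for illustration (they are not count data and not from any real
experiment): each subject has a very different baseline, and the treatment
adds roughly 1 to every subject.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical measurements: same four subjects, before and after treatment.
ctrl = [10.0, 20.0, 30.0, 40.0]
trt = [11.1, 20.9, 31.2, 40.8]
n = len(ctrl)

# Unpaired two-sample t statistic: ignores which values belong to the same
# subject, so the large between-subject spread ends up in the standard error.
# (With equal group sizes the pooled and Welch statistics coincide.)
se_unpaired = sqrt(variance(ctrl) / n + variance(trt) / n)
t_unpaired = (mean(trt) - mean(ctrl)) / se_unpaired

# Paired t statistic: works on the within-subject differences, so the
# subject baselines cancel out before the variance is estimated.
diffs = [b - a for a, b in zip(ctrl, trt)]
se_paired = sqrt(variance(diffs) / n)
t_paired = mean(diffs) / se_paired

print("unpaired t:", round(t_unpaired, 2))
print("paired t:  ", round(t_paired, 2))
```

The mean difference is identical in both cases; only the standard error
changes. Ignoring the pairing buries the treatment effect under the
subject-to-subject variation, which is what happens when "subject" is
left out of the model.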
Simon