[BioC] DESeq vs DESeq2 have different DEGs results

Wed May 21 09:22:31 CEST 2014

Hi Catalina

On 21/05/14 03:03, Catalina Aguilar Hurtado wrote:
> If I get almost (~8) no DEGs by the reduced: ~subject and I get results
> (879) using reduced:~1 what is wrong with it?
>
> I don't understand what is wrong if I want to compare my design with the
> null model of no variability.

You seem to seriously misunderstand what the models mean.

Your two model formulas were:

 >> full: ~ subject + treatment
 >> reduced: ~ 1

This way, DESeq2 will report to you all genes that seem to be 
significantly affected by any of the factors that are present in the full 
model and absent from the reduced model, i.e., subject and treatment.

Hence, you get all genes that differ between subjects _or_ that respond 
to treatment. But you don't want to see genes that differ between 
subjects without being affected by treatment, and this is why "subject" 
has to appear in the reduced model, too:

If you do what Mike suggested

 >> full: ~ subject + treatment
 >> reduced: ~ subject

you account for "subject" in both models. The difference between models 
is only "treatment", i.e. you get the genes that respond to treatment.
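
To make the difference concrete, here is a minimal sketch in Python. It uses 
ordinary least squares on made-up numbers for a single gene, not DESeq2's 
negative binomial GLM, and the function names are invented for illustration. 
In a balanced paired design the fitted values of each model are simple means, 
so the residual sum of squares (RSS) of each model can be computed directly. 
The gene below differs strongly between subjects but does not respond to 
treatment.

```python
from statistics import mean

# Hypothetical log-expression values for one gene: 4 subjects,
# each measured once as control and once as treated.
control = [2.0, 5.0, 8.0, 11.0]
treated = [2.1, 4.9, 8.1, 10.9]


def rss_intercept_only(ctrl, trt):
    """RSS of the model ~ 1 (one common mean for all samples)."""
    values = ctrl + trt
    grand = mean(values)
    return sum((v - grand) ** 2 for v in values)


def rss_subject(ctrl, trt):
    """RSS of the model ~ subject (one mean per subject)."""
    total = 0.0
    for c, t in zip(ctrl, trt):
        m = (c + t) / 2
        total += (c - m) ** 2 + (t - m) ** 2
    return total


def rss_subject_treatment(ctrl, trt):
    """RSS of ~ subject + treatment (additive fit, balanced design only)."""
    grand = mean(ctrl + trt)
    ctrl_effect = mean(ctrl) - grand
    trt_effect = mean(trt) - grand
    total = 0.0
    for c, t in zip(ctrl, trt):
        m = (c + t) / 2
        total += (c - (m + ctrl_effect)) ** 2 + (t - (m + trt_effect)) ** 2
    return total


full = rss_subject_treatment(control, treated)

# How much better the full model fits than each candidate reduced model.
drop_vs_null = rss_intercept_only(control, treated) - full
drop_vs_subject = rss_subject(control, treated) - full

print("improvement over ~ 1:      ", drop_vs_null)
print("improvement over ~ subject:", drop_vs_subject)
```

The improvement over ~ 1 is large, although it is due entirely to the 
subject differences, while the improvement over ~ subject is essentially 
zero. A test against ~ 1 would therefore flag this gene, and a test against 
~ subject would not.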

I should add that none of this is specific to DESeq2; it is just 
usual linear modelling. Any statistics textbook with a chapter on ANOVA 
will help you.

Also, look up the difference between an unpaired and a paired t test. In 
RNA-Seq analysis, a simple t test is underpowered, unless you have many 
samples, and also not quite appropriate, because count data is not 
normally distributed. But as most people are familiar with t tests, it 
helps to know that the comparison

   full: ~ treatment
   reduced: ~ 1

is essentially the same as an unpaired two-sample t test, and

   full: ~ subject + treatment
   reduced: ~ subject

is the same as a paired t test. Hence, reminding yourself when and 
why one uses a paired t test might be illuminating.
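
As a rough illustration of that reminder, the following Python sketch 
computes both t statistics by hand for the same made-up data (the numbers 
and function names are assumptions for illustration only). Every subject 
shows a treatment effect of about +1, but the subjects differ far more from 
each other than that.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical measurements: 4 subjects, control and treated sample each.
control = [2.0, 5.0, 8.0, 11.0]
treated = [3.0, 6.1, 8.9, 12.0]


def unpaired_t(a, b):
    """Two-sample t statistic with pooled variance (ignores the pairing)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(b) - mean(a)) / sqrt(pooled * (1 / na + 1 / nb))


def paired_t(a, b):
    """Paired t statistic: a one-sample t test on within-subject differences."""
    diffs = [y - x for x, y in zip(a, b)]
    return mean(diffs) / sqrt(variance(diffs) / len(diffs))


t_unpaired = unpaired_t(control, treated)
t_paired = paired_t(control, treated)

print("unpaired t:", t_unpaired)
print("paired t:  ", t_paired)
```

The unpaired statistic is small because the between-subject variability 
swamps the treatment effect, whereas the paired statistic is large because 
taking within-subject differences removes that variability. Including 
"subject" in both the full and the reduced model has the same effect.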

   Simon