[BioC] Can DESeq2 find tissue-specific expression genes

Fri Dec 20 10:04:16 CET 2013

Hi

On 20/12/13 04:03, Eman Lee [guest] wrote:
> How can we use DESeq2 to determine which genes are tissue-specific expression?
> Can we take Tissue1 as CASE, other 9 tissues as CONTROL, to find tissue-specific genes (high expression in Tissue1) ?

In principle, yes. However, with this setting, DESeq2 would look for 
genes where Case differs from Control much more than the Control samples 
differ from each other, i.e., extreme cases where a gene has very 
similar expression in all but one tissue and a very different 
expression in the remaining one. If a gene sticks out in more than a 
single tissue (e.g., strong in two and weak in eight tissues), you 
wouldn't find it.

The conventional way would be to do a likelihood ratio test to see 
whether the tissue effect is significant, i.e., compare the models 
"count ~ tissue" against "count ~ 1".

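To make the idea of this test concrete, here is a minimal Python sketch 
for a single gene. It is not what DESeq2 does internally: it assumes a 
plain Poisson model without size factors or dispersion estimates 
(DESeq2 fits a negative binomial GLM), and the function names and toy 
counts are made up for illustration.

```python
import math

def poisson_loglik(counts, mu):
    """Poisson log-likelihood of counts that share one mean mu."""
    if mu == 0:
        return 0.0  # only reachable when every count is 0
    return sum(k * math.log(mu) - mu - math.lgamma(k + 1) for k in counts)

def chi2_sf(x, df):
    """Upper tail probability of a chi-squared distribution (integer df)."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Closed-form starting points: df = 2 (even case) or df = 1 (odd case).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Recurrence Q(a + 1, y) = Q(a, y) + y^a * exp(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1))
        a += 1.0
    return q

def tissue_lrt(counts_by_tissue):
    """LRT of 'count ~ tissue' against 'count ~ 1' for one gene (Poisson)."""
    groups = list(counts_by_tissue.values())
    all_counts = [k for counts in groups for k in counts]
    # Reduced model: one common mean. Full model: one mean per tissue.
    ll_reduced = poisson_loglik(all_counts, sum(all_counts) / len(all_counts))
    ll_full = sum(poisson_loglik(c, sum(c) / len(c)) for c in groups)
    stat = 2.0 * (ll_full - ll_reduced)
    return stat, chi2_sf(stat, len(groups) - 1)

# Toy data (hypothetical): two replicates for each of three tissues.
flat_gene = {"t1": [10, 12], "t2": [11, 9], "t3": [10, 11]}
specific_gene = {"t1": [100, 110], "t2": [10, 12], "t3": [11, 9]}

for name, gene in [("flat", flat_gene), ("specific", specific_gene)]:
    stat, p = tissue_lrt(gene)
    print(name, round(stat, 3), p)
```

A small p value here only says that tissue matters for the gene; it 
does not say which tissue is responsible.
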
You can then look at the shrunken log fold changes reported by DESeq2 
for the individual tissues to find out which tissue(s) differ.

Alternatively, you can run Wald tests (DESeq2 offers both likelihood 
ratio tests and Wald tests) and use the Wald test p values to find the 
tissues that differ significantly from the average for a gene.
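
As a rough illustration of such a per-tissue Wald test, the Python 
sketch below compares each tissue's log mean with the average of all 
tissues' log means for one gene. Again, this is a simplification under 
stated assumptions and not DESeq2's implementation: it uses a Poisson 
model, approximates the variance of each log mean by 1 / (total count 
in that tissue), applies no shrinkage, and needs at least one non-zero 
count per tissue. The function name and data are hypothetical.

```python
import math

def wald_vs_average(counts_by_tissue):
    """Wald test of each tissue's log mean against the average log mean."""
    tissues = list(counts_by_tissue)
    n = len(tissues)
    log_mean = {t: math.log(sum(c) / len(c)) for t, c in counts_by_tissue.items()}
    # Poisson approximation: Var(log mean) is about 1 / (sum of counts).
    var = {t: 1.0 / sum(c) for t, c in counts_by_tissue.items()}
    average = sum(log_mean.values()) / n
    results = {}
    for t in tissues:
        effect = log_mean[t] - average
        # Contrast weights: (1 - 1/n) on tissue t, -1/n on every other tissue.
        others = sum(var[u] for u in tissues if u != t)
        se = math.sqrt((1 - 1 / n) ** 2 * var[t] + others / n ** 2)
        z = effect / se
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p value
        results[t] = (effect / math.log(2), z, p)  # effect as log2 fold change
    return results

# Toy data (hypothetical): two replicates for each of three tissues.
gene = {"t1": [100, 110], "t2": [10, 12], "t3": [11, 9]}
wald = wald_vs_average(gene)
for tissue, (log2fc, z, p) in wald.items():
    print(tissue, round(log2fc, 2), round(z, 2), p)
```

Note that one very strong tissue also pulls the average up, so the 
remaining tissues can appear significantly below average as well; the 
sign and size of the log fold change matter, not only the p value.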

In standard linear modelling, you have to assign one of the tissues as 
your "base level". It gets absorbed into the model's intercept and all 
other tissues' expressions are reported relative to it, and the log fold 
changes get shrunken towards it (if you use DESeq2's coefficient 
shrinkage). This is undesirable as it makes one tissue special. To solve 
this, we have, very recently, implemented "expanded design matrices" in 
the devel version of DESeq2, and this might be quite useful for you. 
(The original motivation was also a search for tissue-specific usage, in 
that case of exons; see Reyes et al., PNAS 2013, 110:15377).
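
To show what the base level does to the design matrix, here is a small 
Python sketch that builds both variants for a toy sample list. It 
assumes that "expanded" means an intercept plus one indicator column 
for every tissue, so that no tissue is absorbed into the intercept; the 
column names and helper functions are made up and are not DESeq2's.

```python
def treatment_design(tissues, base):
    """Standard coding: the base level is absorbed into the intercept."""
    others = sorted(set(tissues) - {base})
    columns = ["Intercept"] + [f"tissue_{t}_vs_{base}" for t in others]
    rows = [[1] + [int(s == t) for t in others] for s in tissues]
    return columns, rows

def expanded_design(tissues):
    """Expanded coding: intercept plus one indicator for every tissue."""
    levels = sorted(set(tissues))
    columns = ["Intercept"] + [f"tissue{t}" for t in levels]
    rows = [[1] + [int(s == t) for t in levels] for s in tissues]
    return columns, rows

# Hypothetical sample annotation: one tissue label per sample.
samples = ["liver", "liver", "brain", "heart"]

std_cols, std_rows = treatment_design(samples, base="liver")
exp_cols, exp_rows = expanded_design(samples)

print(std_cols)
for row in std_rows:
    print(row)
print(exp_cols)
for row in exp_rows:
    print(row)
```

In the expanded matrix the indicator columns add up to the intercept 
column, so it is not of full rank; it only makes sense together with 
shrinkage of the coefficients, which then pulls all tissues towards a 
common middle instead of towards one chosen tissue.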

   Simon