[BioC] Testing for no change in RNA-seq data?

Thu Mar 21 14:42:32 CET 2013

Hi Ryan

On 21/03/13 01:05, Ryan C. Thompson wrote:
> So, we have several great packages for testing for evidence of
> differential expression using RNA-seq counts, and I am happily using
> those. However, I would like to know if there is any method for testing
> for evidence of equivalent expression. That is, I would like to show
> with some level of confidence that treatment X has no (or negligible)
> effect on gene Y. Is there any way to conduct such a test?

One of the new features of DESeq2 is that it provides a empirical-Bayes 
style shrinkage estimates of coefficients (i.e., log fold changes) and 
also estimates standard errors for these coefficients (taken from the 
the reciprocal curvature of the posterior). Your task is one of the 
application we had in mind for this.

You want to know whether the fold change is zero or nearly zero, and you 
also want to be ascertained that this estimate of a nearly-zero fold 
change is a precise one. So, to find genes whose absolute log fold 
change is _reliably_ smaller than some threshold, take all those genes 
for which the estimated abs log fold change is below the threshold and 
the standard error is well below the threshold, too.

The reason, why I write "below a threshold" rather than "equal to zero" 
is that it's biologically unreasonable to assume that a treatment has 
exactly zero effect on a gene's expression. Gene regulation is such an 
interconnected network that, at least in my view, every gene will react 
ever so slightly to every perturbance, but often, this reaction is too 
small to be measurable. This is why you should define a threshold for 
"biologically unlikely to be relevant". Let's say, we don't believe that 
an expression change of less than 15% (0.2 on a log2 scale) is worth 
bothering. So, if you might take all genes with abs log2 fold change 
below 0.2, and furthermore require that the standard error of this log2 
fold change estimate is below, say, 0.05, than you will get gene with 
true values well below at least 0.3 or so.

In the end, the thresholds won't matter too much but if you want real p 
values, this is possible, too: Simply divide the log2 fold change 
estimates by their standard errors to get z scores (this works because 
the sampling distribution of our fold change estimates is reasonably 
close to normal), and do two one-sided tests (TOST) as suggested by Ryan 
above.

Actually, as this may come up more often, we should maybe explain the 
details of using this approach in the vignette. Ask again if you want to 
see some code.

  Simon