[BioC] Testing for no change in RNA-seq data?

Simon Anders anders at embl.de
Fri Mar 22 13:08:00 CET 2013


Hi Ryan

On 22/03/13 00:11, Ryan C. Thompson wrote:
> I got as far as computing the z-scores, but I'm not sure how to go from
> there to p-values for a test of equivalence.

I guess I made it sound more trivial than it is. The "two one-sided 
tests" (TOST) scheme already mentioned in this thread is originally due 
to Schuirmann (Journal of Pharmacokinetics and Pharmacodynamics, 1987) 
-- or rather, the acronym is due to him, the idea is probably older, and 
he just pointed out that this what one should do in situations such as 
yours.

To recap, you want to test the null hypothesis that your log fold change 
beta is larger than some threshold theta, i.e., you want to find genes 
for which you can reject this null hypothesis and hence say with some 
certainty that |beta| < theta. The TOST scheme suggests, as its name 
implies to perform two pone sided tests, one with the null hypothesis 
H0_A: beta > theta, the other with H0_B: beta < -theta, and then use the 
larger of the two p values.

I demonstrate this below with some code.

# Let's use the example data from the pasilla package
library( DESeq2 )
library( pasilla )
data( "pasillaGenes" )

# Create a DESeq2 data object from the pasilla data
dse <- DESeqSummarizedExperimentFromMatrix(
    counts(pasillaGenes), pData(pasillaGenes)[,c("condition","type")],
    ~ type + condition )

# Perform a standard DESeq2 analysis
dse <- DESeq(dse)

# The log2 fold changes are found here
beta <- results(dse)$log2FoldChange

# Just to make the following clearer, I should point out that the
# "results" accessor is just a short-cut for this access to the rowData:
all( beta == mcols(rowData(dse))$conditionuntreated, na.rm=TRUE )
# (returns TRUE)

# The log fold change estimates all come with standard error information
# which we find in the rowData (maybe we should copy this to the
# 'results', too)
betaSE <- mcols(rowData(dse))$SE_conditionuntreated

# Internally, the Wald test is implemented as a simple two-sided
# z test of beta/betaSE. Two demonstrate this, we to the test
# manually and compare
pvalDE <- 2 * pnorm( abs( beta ), sd = betaSE, lower.tail=FALSE )
all( abs( pvalDE - results(dse)$pvalue ) < 1e-15, na.rm=TRUE )
# (returns TRUE)

# This was the test for DE, of course, i.e., small pvalDE means that
# the gene's expression change (the true value of beta) is not zero

# What we want is the opposite, namely find gene, for which abs(beta)
# is smaller than some threshold, theta
theta <- .3

# So, we do our two one-sided tests. For a one-sided z test, we
# simply use tail probabilities from the normal distribution.

# First, the test of H0_A: true_beta > thr
pA <- pnorm( beta, thr, betaSE, lower.tail=TRUE )

# Next, the test of H0_B: true_beta < -thr
pB <- pnorm( beta, -thr, betaSE, lower.tail=FALSE )

# The overall p value is the maximum, because we want to reject H0_A
# and H0_B simultaneously
pvalTOST <- pmax( pA, pB )


# Let's adjust our two p values with BH:
sigDE <- p.adjust( pvalDE, "BH" ) < .1
sigSmall <- p.adjust( pvalTOST, "BH" ) < .1

# And make an MA plot, with sigDE in red and sigSmall in green
plot(
    rowMeans( counts(dse,normalized=TRUE) ), beta,
    log="x", pch=20, cex=.2,
    col = 1 + sigDE + 2*sigSmall )
# Plot is attached.

I think, we will include this code in one of the coming releases of DESeq2.


   Simon

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Rplot001.png
Type: image/png
Size: 46596 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20130322/ed095934/attachment.png>


More information about the Bioconductor mailing list