[BioC] Testing for no change in RNA-seq data?
Ryan C. Thompson
rct at thompsonclan.org
Fri Mar 22 17:32:07 CET 2013
Thanks, Simon, that makes it very clear.
On 03/22/2013 05:08 AM, Simon Anders wrote:
> Hi Ryan
>
> On 22/03/13 00:11, Ryan C. Thompson wrote:
>> I got as far as computing the z-scores, but I'm not sure how to go from
>> there to p-values for a test of equivalence.
>
> I guess I made it sound more trivial than it is. The "two one-sided
> tests" (TOST) scheme already mentioned in this thread is originally
> due to Schuirmann (Journal of Pharmacokinetics and Pharmacodynamics,
> 1987) -- or rather, the acronym is due to him, the idea is probably
> older, and he just pointed out that this what one should do in
> situations such as yours.
>
> To recap, you want to test the null hypothesis that your log fold
> change beta is larger than some threshold theta, i.e., you want to
> find genes for which you can reject this null hypothesis and hence say
> with some certainty that |beta| < theta. The TOST scheme suggests, as
> its name implies to perform two pone sided tests, one with the null
> hypothesis H0_A: beta > theta, the other with H0_B: beta < -theta, and
> then use the larger of the two p values.
>
> I demonstrate this below with some code.
>
> # Let's use the example data from the pasilla package
> library( DESeq2 )
> library( pasilla )
> data( "pasillaGenes" )
>
> # Create a DESeq2 data object from the pasilla data
> dse <- DESeqSummarizedExperimentFromMatrix(
> counts(pasillaGenes), pData(pasillaGenes)[,c("condition","type")],
> ~ type + condition )
>
> # Perform a standard DESeq2 analysis
> dse <- DESeq(dse)
>
> # The log2 fold changes are found here
> beta <- results(dse)$log2FoldChange
>
> # Just to make the following clearer, I should point out that the
> # "results" accessor is just a short-cut for this access to the rowData:
> all( beta == mcols(rowData(dse))$conditionuntreated, na.rm=TRUE )
> # (returns TRUE)
>
> # The log fold change estimates all come with standard error information
> # which we find in the rowData (maybe we should copy this to the
> # 'results', too)
> betaSE <- mcols(rowData(dse))$SE_conditionuntreated
>
> # Internally, the Wald test is implemented as a simple two-sided
> # z test of beta/betaSE. Two demonstrate this, we to the test
> # manually and compare
> pvalDE <- 2 * pnorm( abs( beta ), sd = betaSE, lower.tail=FALSE )
> all( abs( pvalDE - results(dse)$pvalue ) < 1e-15, na.rm=TRUE )
> # (returns TRUE)
>
> # This was the test for DE, of course, i.e., small pvalDE means that
> # the gene's expression change (the true value of beta) is not zero
>
> # What we want is the opposite, namely find gene, for which abs(beta)
> # is smaller than some threshold, theta
> theta <- .3
>
> # So, we do our two one-sided tests. For a one-sided z test, we
> # simply use tail probabilities from the normal distribution.
>
> # First, the test of H0_A: true_beta > thr
> pA <- pnorm( beta, thr, betaSE, lower.tail=TRUE )
>
> # Next, the test of H0_B: true_beta < -thr
> pB <- pnorm( beta, -thr, betaSE, lower.tail=FALSE )
>
> # The overall p value is the maximum, because we want to reject H0_A
> # and H0_B simultaneously
> pvalTOST <- pmax( pA, pB )
>
>
> # Let's adjust our two p values with BH:
> sigDE <- p.adjust( pvalDE, "BH" ) < .1
> sigSmall <- p.adjust( pvalTOST, "BH" ) < .1
>
> # And make an MA plot, with sigDE in red and sigSmall in green
> plot(
> rowMeans( counts(dse,normalized=TRUE) ), beta,
> log="x", pch=20, cex=.2,
> col = 1 + sigDE + 2*sigSmall )
> # Plot is attached.
>
> I think, we will include this code in one of the coming releases of
> DESeq2.
>
>
> Simon
>
More information about the Bioconductor
mailing list