[BioC] limma: correcting for a correlation between contrasts
Alex Gutteridge
alexg at ruggedtextile.com
Thu Jul 11 11:18:16 CEST 2013
I have an experimental design (or rather question) that goes beyond my
usual limma knowledge. I am hoping there is some elegant way to answer
it that someone here can point me towards. The experimental design is
straightforward: one control group (Ctrl) and two treatment groups (T1 &
T2) with three replicates each:
S1 Ctrl
S2 Ctrl
S3 Ctrl
S4 T1
S5 T1
S6 T1
S7 T2
S8 T2
S9 T2
It is clear from some basic exploratory analysis that the effect of T2
is very similar to T1, but has a slightly reduced response. I.e.
plotting log fold changes for the Ctrl - T1 contrast vs the Ctrl - T2
contrast gives a clear linear correlation but with slope <1. This is
expected given what we know about the biology of the system, but we
suspect there are some genes that will be specific for the T2 treatment.
My first instinct is to setup a contrast for limma like: (Ctrl - T1) -
(Ctrl - T2), but because the T2 response is across the board less than
T1 this ends up just highlighting that fact rather than pulling out the
T2 'specific' response.
Hopefully this code explains things visually with some simulated data:
T1.response = rnorm(100,0,1)
T2.response = T1.response * 0.5 + rnorm(100,0,0.2)
T1.response[101:110] = rnorm(10,0,0.2)
T2.response[101:110] = rnorm(10,1,0.2)
plot(T1.response,T2.response,xlim=c(-2,2),ylim=c(-2,2),col=c(rep("green",100),rep("black",10)),pch=16)
abline(0,1,col="blue")
abline(-0.5,1,lty=2,col="blue")
abline(0.5,1,lty=2,col="blue")
abline(model,col="red")
abline(model$coefficients[1]+0.5,model$coefficients[2],lty=2,col="red")
abline(model$coefficients[1]-0.5,model$coefficients[2],lty=2,col="red")
The (Ctrl - T1) - (Ctrl - T2) contrast tends to pull out the T2
specific genes (black) which I want, but also those genes outside the
blue tramlines that are simply not responding as strongly to T2 as T1
(which I don't want). In the real data, those genes out-number the
really T2 specific genes considerably and dominate in any downstream
GSEA type analysis. Is there a way to setup the contrast instead so
limma corrects for the correlation between the two datasets and looks
for genes with large residuals from the red (fitted) line rather than
the blue?
--
Alex Gutteridge
More information about the Bioconductor
mailing list