[BioC] help with limma design,contrast matrix please
James W. MacDonald
jmacdon at med.umich.edu
Thu Jan 26 18:17:34 CET 2012
Hi John,
On 1/25/2012 8:21 AM, John Alexander wrote:
> Hi all,
> I'd really appreciate if someone can explain to me whether I've taken the following steps correctly. I'm trying to find the difference in gene expression between (wt vs mutant) for illumina mouse data(MouseWG-6 v2.0 chip). I used a design and contrast matrix (although i'm not sure whether i really need the contrast matrix).
>
> Script :
>
> exprs(BSD.quantile)->matrixExprs
> design<-model.matrix(~-1+factor(c(1,1,1,1,1,2,2,2,2,2,2)))
> colnames(design)<-c("group1","group2")
> fit<-lmFit(matrixExprs,design)
> contrast.matrix<-makeContrasts(group2-group1,levels=design)
> fit2<-contrasts.fit(fit,contrast.matrix)
> fit2<- eBayes(fit2)
> topTable(fit2,coef=1,adjust="BH")
>
>
>> pData(BSL)
> sampleID
> 23 wt 23 wt
> 24 wt 24 wt
> 48 wt 48 wt
> 61 wt 61 wt
> 71 wt 71 wt
> 8-Het 8-Het
> 28-Het 28-Het
> 54-Het 54-Het
> 59-Het 59-Het
> 79-Het 79-Het
> 87-Het 87-Het
>
> I created my design matrix as below (group1 -wt,group2-het):
> group1 group2
> 1 1 0
> 2 1 0
> 3 1 0
> 4 1 0
> 5 1 0
> 6 0 1
> 7 0 1
> 8 0 1
> 9 0 1
> 10 0 1
> 11 0 1
> attr(,"assign")
> [1] 1 1
> attr(,"contrasts")
> attr(,"contrasts")$`factor(c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2))`
> [1] "contr.treatment"
>
> Contrasts
> Levels group2 - group1
> group1 -1
> group2 1
>
>
> I'm hoping this is correct, i'd appreciate if someone could give me a bit more understanding. I'm quite confused reading the limma guide.
You did everything correctly. As you note, you could have done things
without a contrasts matrix if you had parameterized your model
differently. As an example,
> model.matrix(~factor(rep(1:2, each=5)))
(Intercept) factor(rep(1:2, each = 5))2
1 1 0
2 1 0
3 1 0
4 1 0
5 1 0
6 1 1
7 1 1
8 1 1
9 1 1
10 1 1
You can figure out what the coefficients in your model are, by noting
that each row of the design matrix corresponds to the model for that
sample.
For instance, in your original design matrix, you had
1 0
in the first row, which corresponds to your WT samples. So this implies
that the model you are fitting is
WT sample = 1(something) + 0(other thing)
And for row 6-10, it was
0 1
indicating
Het sample = 0(something) + 1(other thing)
Knowing that we are computing means, we can then deduce that 'something'
== means of WT samples, and 'other thing' == means of Het samples. Thus,
to make comparisons you have to use a contrast matrix to do the
subtraction explicitly.
In my example, the first 5 rows are obvious. We are computing the mean
of the WT samples. However, the rows 6-10 are mysterious. Why two 1s?
We know that the samples for these rows are Het, and the first
coefficient is WT, so the model is
Het sample = 1(WT) + 1(wazzat)
But simple algebra tells us that this is equivalent to
Het sample - WT = wazzat
Right?
So the second coefficient in my model is explicitly computing the
difference between Het and WT, which is what you want. And you can get
it without a contrasts matrix. Just do
fit <- lmFit(matrixExprs, design)
fit2 <- eBayes(fit)
and then the coefficient of interest is coef=2. Try it and see.
Best,
Jim
>
> cheers,
> john
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues
More information about the Bioconductor
mailing list