[BioC] how does limma calculate the coefficients with using different design matrix?

Tue Aug 17 01:27:40 CEST 2010

Dear Xiaokuan,

Your example simply shows that the two design matrices are different, as 
they naturally are.  You haven't actually used contrasts.fit() at all. 
Do you understand how it is intended to be used?

You need something like this:

fit<-lmFit(dat,designcontr)
fit <- contrasts.fit(fit,cont.matrix_contr)
topTable(eBayes(fit))

and

fit<-lmFit(dat,design)
fit <- contrasts.fit(fit,cont.matrix)
topTable(eBayes(fit))

When I do run the above code, I get exactly the same results from the two 
approaches, just as it supposed to happen.

Best wishes
Gordon

On Mon, 16 Aug 2010, Xiaokuan Wei wrote:

Dear Gordon,

Thank you for your detailed explaination. 
In fact, my data don't have missing values and in function of lmFit, I didn't 
use weights argument.
I dig deeper for this issue trying to see what the coefficients are after lmFit.

When use designcontr: 
fit<-lmFit(dat,designcontr)
tp1<-topTable(eBayes(fit))
tp1<-topTable(eBayes(fit),n=Inf)
tp1[tp1$ID=='feature1',]
             ID       g0          g1       g2      g3       g4        g5       
g6      g7     g8      g9  AveExpr        F
1 feature1 11.32317 -0.07976296 2.900216 3.06401 3.210565 0.2295532 1.545972 
2.433364 2.62326 2.719911 13.18788 111975.0
          P.Value    adj.P.Val
1 5.284493e-27 5.296354e-25

When use design:
fit<-lmFit(dat,design)
tp2<-topTable(eBayes(fit),n=Inf)
tp2[tp2$ID=='feature1',]
             ID       g0       g1       g2       g3       g4       g5       
g6      g7      g8      g9  AveExpr        F
1 feature1 11.32317 11.24341 11.55272 12.86914 13.75653 13.94643 14.04308 
14.22338 14.38718 14.53373 13.18788 111975.0
          P.Value    adj.P.Val
1 5.284493e-27 5.296354e-25

So, as you said, if I use contrasts.fit with designcontr and cont.matrix_contr, 
I get exact the coefficents of g1 to g9 for coef1 to coef9.
but when I used design and contr.matrix, only the coef1 is 
g1-g0=11.24341-11.32317=-0.07976 which is the same vaule as using designcontr 
and cont.matrix_contr. however, the other coefficents from coef2 to 
coef9 ARE the difference between each g[i] with g0. but the results are 
DIFFERENT from using designcontr and cont.matrix_contr; as you can see that I 
didn't use weights, and my data have no missing values.

Do you have any comments on this? Thank you very much.

Xiaokuan

________________________________
From: Gordon K Smyth <smyth at wehi.EDU.AU>
To: Xiaokuan Wei <weixiaokuan at yahoo.com>
Cc: Bioconductor mailing list <bioconductor at stat.math.ethz.ch>
Sent: Sun, August 15, 2010 10:40:58 PM
Subject: [BioC] how does limma calculate the coefficients with using different 
design matrix?

Dear Xiaokuan,

You haven't told us about your data, but it must contain missing values or 
weights, because otherwise the two topTable results you give would have 
been identical.  In the absence of missing values or weights, the results 
would have been as you expected.

With missing values or weights, the first topTable result you give (the 
one with designcontr) is exact whereas the second result is approximate.

The fact that contrast.fit() gives approximate results in the presence of 
missing values or weights is mentioned on the documentation page for 
contrast.fit, and has been discussed a few times on this list, see

https://www.stat.math.ethz.ch/pipermail/bioconductor/2009-December/030947.html

Professor Albyn Jones is trying to pursuade me to implement something 
similar to contrasts.fit() that would be exact in all cases.  The only way 
I could do this would be to add an argument 'contrasts' to lmFit.  I may 
yet do that!

Best wishes
Gordon

> Date: Sat, 14 Aug 2010 12:30:57 -0700 (PDT)
> From: Xiaokuan Wei <weixiaokuan at yahoo.com>
> To: bioconductor <bioconductor at stat.math.ethz.ch>
> Subject: [BioC] how does limma calculate the coefficients with using
>     different design matrix?
>
> Dear list,
>
> I have two design matrixes:
>
> design<-model.matrix(~-1+factor(grp))
> designcontr<-model.matrix(~factor(grp))
>
> then make contrast matrix:
> cont.matrix<-rbind(rep(-1,9),diag(9))
> cont.matrix_contr <- rbind(rep(0,9),diag(9))
>
> after using topTable to get the specific probe values, I found:
> e.g.
> using designcontr and cont.matrix_contr:
> ID Coef1 Coef2 Coef3 Coef4 Coef5 Coef6 Coef7 Coef8 Coef9 AveExpr F P.Value
> adj.P.Val
>
> feature1 -0.07976 2.900216 3.06401 3.210565 0.229553 1.545972 2.433364 2.62326
> 2.719911 13.18788 1133.135 1.60E-15 7.20E-11
>
>
> using design and cont.matrix:
> ID Coef1 Coef2 Coef3 Coef4 Coef5 Coef6 Coef7 Coef8 Coef9 AveExpr F P.Value
> adj.P.Val
>
> feature1 -0.07976 0.229553 1.545972 2.433364 2.62326 2.719911 2.900216 3.06401
> 3.210565 13.18788 1133.135 1.60E-15 7.20E-11
>
>
>
> I try to understand why there is difference between cont.matrix_contr and
> cont.matrx results, since both of them is try to get the difference between 
>each
> grp(grp2 to grp10) and grp1. i.e. grp[i](i=2-10) - grp[1]
>
> even I used the different contrasts, the coef should  be the same. I am
> confused.
>
>
> Thank you for your help.
>
> Regards,
> Xiaokuan