[BioC] limma design matrix

Fri Feb 27 17:42:46 MET 2004

Hi

First off, let me say that I think limma is quite a brilliant package and I use it a lot.  However, one of the biggest obstacles to using limma for the lay biologist is an inability to understand the design matrix.  Although there is a lot of documentation showing the design matrix for a number of example problems, there is no discussion of how that design matrix was constructed, i.e. the logical thought processes that went into it.

For the two-sample experiment given in the UserGuide, I understand that there must be one row per array in my design matrix and the columns represent the coefficients I want to calculate.  I represent the differences in the factors with 1's and 0's.  Great, this is pretty similar to how I do it for aov().
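
To make that concrete, here is a minimal sketch of how I read the two-sample case.  It uses base R's model.matrix() rather than anything limma-specific, and the four-array layout and the group labels "A" and "B" are made up for illustration; they are not the UserGuide's example.

```r
# Hypothetical layout: four arrays, two per group (labels are made up).
group <- factor(c("A", "A", "B", "B"))

# model.matrix() expands the factor into a 0/1 design matrix:
# one row per array, one column per coefficient.
design <- model.matrix(~ group)
print(design)
```

With the default treatment contrasts, the first column is the intercept (all 1s) and the second column flags the arrays that belong to group "B".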

But then, all of a sudden, for the factorial experiment the design matrix not only has -1's in there, but also a column for the interactions.  How do I decide which array/factor combination gets a 1, a 0 or a -1?  

Let me put this in perspective.  I have a three-factor experiment where the factors are animal, infected/uninfected and time.  All samples were hybridised against a common reference.  For analysis of variance, all I do is set up a data.frame like this for each gene:

  data c b t
1  2.9 1 1 1
2  2.7 1 0 2
3  2.8 1 1 1
4  3.0 1 0 2
5 -3.0 0 1 1
6 -3.5 0 0 2
7 -4.0 0 1 1
8 -5.0 0 0 2

where data is my data, and c, b and t are my factors, and then feed in something like:

(aov.aov <- aov(data ~ c*b*t, aov.data))

and I get F-statistics for c, b and t and all possible interactions.  
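
For anyone who wants to reproduce the call, this is a minimal sketch that rebuilds the eight rows shown above and fits the same model.  Converting c, b and t to factors is my assumption about how they should be treated; the values themselves are copied from the table.

```r
# Rebuild the eight-row example table for a single gene.
aov.data <- data.frame(
  data = c(2.9, 2.7, 2.8, 3.0, -3.0, -3.5, -4.0, -5.0),
  c    = factor(c(1, 1, 1, 1, 0, 0, 0, 0)),  # assumption: treat as factor
  b    = factor(c(1, 0, 1, 0, 1, 0, 1, 0)),  # assumption: treat as factor
  t    = factor(c(1, 2, 1, 2, 1, 2, 1, 2))   # assumption: treat as factor
)

# Full factorial model: main effects plus all interactions.
aov.aov <- aov(data ~ c * b * t, data = aov.data)
print(summary(aov.aov))
```

One caveat about these eight rows specifically: b and t always change together (b = 1 goes with t = 1, and b = 0 with t = 2), so with this toy table alone aov() cannot separate their effects and reports the terms involving t as not estimable.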

Because of the limitations of analysis of variance for my microarray data, I would like to use limma.  Is there any *more* documentation I can look at that will tell me the steps to take to work out what my limma design matrix will look like?

Kind regards

Michael