[BioC] Re: Designing matrix with limma
Gordon Smyth
smyth at wehi.edu.au
Mon Aug 25 20:23:51 MEST 2003
At 04:46 AM 23/08/2003, Sek Won Kong, M.D wrote:
>Dear Gordon
>
>I am sorry for any inconvenience. I have a question about the design matrix
>in limma.
>We designed and completed an experiment, but it is pretty tough to analyse.
>The design is a 2 x 2 x 2 factorial, with one more factor: 2 different
>scanner settings were used randomly. The actual experiment looks like this.
>
>        |      Experiment A       |      Experiment B
>--------+-------------------------+------------------------
>        | Control   | Treatment   | Control   | Treatment
>--------+-----------+-------------+-----------+------------
>Time A  |           |             |           |
>Time B  |           |             |           |
>
>Each cell has 5 biological replicates, each on an Affy array. It's also
>possible to use just a 2 x 2 factorial ANOVA and then compare results, but I
>think it's better to start with a single model in terms of parsimony, and
>also because the two experiments are closely related in a biological sense.
This raises a lot of issues of which probably the easiest is how to create
a design matrix in limma. Let's consider the design matrix first.
You have 4 factors each with 2 levels, i.e., a 2^4 design, including the
scanner settings. Do you know how to analyse ordinary factorial experiments
with univariate data using R? If you do, then the extension to microarrays
is straightforward in principle although the interpretation of the
parameters may be difficult. You might analyse an ordinary experiment in R
using a call to 'lm' such as

   lm( y ~ (facA+facB+facC+facD)^4 )

where facA, facB, facC and facD are your factors. (I will assume for this
email that you know how to create factors in R.) To use limma with
microarray data, you can simply set

   design <- model.matrix( ~(facA+facB+facC+facD)^4 )
   fit <- lmFit( eset, design)

i.e., you can use function 'model.matrix' to extract the design matrix from
the linear model formula. (I have assumed you have the development version
of limma so that you can use lmFit.)
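
As a rough sketch of the steps above, the factors and the design matrix could be set up as follows. The layout (5 arrays in each of the 2 x 2 x 2 cells), the level labels, the alternating scanner allocation and the simulated expression matrix are all assumptions made for illustration; only the factor names facA to facD, 'model.matrix' and 'lmFit' come from the text.

```r
# Hypothetical layout: 2 x 2 x 2 cells with 5 arrays each (40 arrays).
# Factor names follow the email; the level labels are made up.
targets <- expand.grid(rep  = 1:5,
                       facA = c("ExpA", "ExpB"),          # experiment
                       facB = c("Control", "Treatment"),  # treatment
                       facC = c("TimeA", "TimeB"),        # time
                       stringsAsFactors = TRUE)
# Scanner setting alternates here so that every cell sees both settings;
# a real random allocation may leave some combinations empty.
targets$facD <- factor(rep(c("Set1", "Set2"), length.out = nrow(targets)))

design <- model.matrix(~ (facA + facB + facC + facD)^4, data = targets)
dim(design)        # 40 arrays by 16 coefficients
qr(design)$rank    # 16 when all 16 factor combinations are observed

# Fit gene-wise linear models if limma is installed (simulated data only).
if (requireNamespace("limma", quietly = TRUE)) {
  set.seed(1)
  exprs_sim <- matrix(rnorm(100 * nrow(targets)), nrow = 100)
  fit <- limma::lmFit(exprs_sim, design)
}
```

If the random scanner allocation left some factor combinations unobserved, the rank would fall below 16 and some of the interaction coefficients could not be estimated.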
The difficulty is in interpreting the estimated coefficients from your
model fit. How will you interpret three or four way interaction terms?
Perhaps you would be better testing for a difference between the scanners
and then analysing the other three factors separately. Perhaps it is the
control vs treatment and time A vs time B comparisons, i.e., the 2x2
factorial with treatment and time, which are really of interest to you. In
that case you have a real chance of attaching meaningful biological
interpretations to the estimated
coefficients. You need to think carefully about what questions you want to
answer from your experiment and then tailor the analysis accordingly.
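
To illustrate why the 2x2 case is easier to interpret, here is a small sketch using R's default treatment contrasts. The variable names, level labels and cell means are invented for the example, and the response is noise-free so that the coefficients can be read off exactly.

```r
# Hypothetical 2 x 2 layout: treatment by time, 5 arrays per cell.
targets2 <- expand.grid(rep       = 1:5,
                        treatment = c("Control", "Treatment"),
                        time      = c("TimeA", "TimeB"))
design2 <- model.matrix(~ treatment * time, data = targets2)
colnames(design2)

# Noise-free toy response with made-up cell means, in row order:
# Control/TimeA = 1, Treatment/TimeA = 3, Control/TimeB = 2, Treatment/TimeB = 7
y <- rep(c(1, 3, 2, 7), each = 5)
coefs <- lm.fit(design2, y)$coefficients
round(coefs, 6)
# (Intercept)                  = 1: mean of Control at TimeA
# treatmentTreatment           = 2: treatment effect at TimeA (3 - 1)
# timeTimeB                    = 1: time effect in Control (2 - 1)
# treatmentTreatment:timeTimeB = 3: change in treatment effect, (7 - 2) - (3 - 1)
```

Each of the four coefficients answers a question that can be stated in biological terms, which is much harder to do for the three and four way interaction terms of the full model.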
It would probably be a good idea to consult a statistician at Harvard to
help work out an analysis strategy.
Regards
Gordon
>Thank you in advance for your help.
>
>Sek Won Kong with Bests.