[BioC] limma - interpreting factorial design

Tue Feb 24 17:28:56 CET 2009

Hi John,

if you are only concerned about the numerical values, you can simply 
take the equations from the user guide and "interpret" them.

Interaction effect

example 1 (treatment contrasts):
(Mu.S-Mu.U)-(WT.S-WT.U)

example 2 (sum to zero):
(WT.U-WT.S-Mu.U+Mu.S)/4

Expanding example 1 gives WT.U-WT.S-Mu.U+Mu.S, which is exactly the 
numerator of example 2. Thus, the estimated coefficient in example 2 is 
a quarter of that in example 1. (And the interaction effect should be 
(Mu.S-Mu.U)-(WT.S-WT.U).)
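
To check the factor of four numerically, here is a minimal sketch in base R. It assumes the gene 1 values and the dp/TNF factors quoted further down in this thread, and it uses lm() on a single gene instead of lmFit(); the object names (y, fit.trt, fit.sum) are made up for illustration.

```r
# Gene 1 expression values and factors as quoted later in this thread
y <- c(4.865401, 5.114202, 4.719609, 4.882969, 4.857923, 4.807370,
       4.538509, 4.759865, 4.779017, 4.430844, 4.519123, 4.499975)
dp  <- factor(rep(c("Yes", "No", "Yes", "No"), each = 3), levels = c("No", "Yes"))
TNF <- factor(rep(c("No", "Yes"), each = 6), levels = c("No", "Yes"))

# Example 1: treatment contrasts, requested explicitly for both factors
fit.trt <- lm(y ~ dp * TNF,
              contrasts = list(dp = "contr.treatment", TNF = "contr.treatment"))

# Example 2: sum-to-zero contrasts for both factors
fit.sum <- lm(y ~ dp * TNF,
              contrasts = list(dp = "contr.sum", TNF = "contr.sum"))

int.trt <- unname(coef(fit.trt)["dpYes:TNFYes"])
int.sum <- unname(coef(fit.sum)["dp1:TNF1"])

print(int.trt)       # interaction under treatment contrasts
print(4 * int.sum)   # same value, recovered from the sum-to-zero fit
```

Both printed values should agree with the 0.1588 reported for the Interaction column in the original question.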

The grand mean, however, is already estimated directly: the sum-to-zero 
intercept is (WT.U+WT.S+Mu.U+Mu.S)/4. (So there is no need to multiply 
it by four. Again, try interpreting the equation given.)
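
The grand mean can be read off the data in the same spirit. A small sketch, assuming the gene 1 values and factors quoted further down in this thread (the name cell.means is only illustrative): the sum-to-zero intercept is the average of the four group means, so multiplying it by four gives the sum of the group means, not a mean.

```r
# Gene 1 expression values and factors as quoted later in this thread
y <- c(4.865401, 5.114202, 4.719609, 4.882969, 4.857923, 4.807370,
       4.538509, 4.759865, 4.779017, 4.430844, 4.519123, 4.499975)
dp  <- factor(rep(c("Yes", "No", "Yes", "No"), each = 3), levels = c("No", "Yes"))
TNF <- factor(rep(c("No", "Yes"), each = 6), levels = c("No", "Yes"))

# Mean of each of the four dp x TNF groups
cell.means <- tapply(y, list(dp = dp, TNF = TNF), mean)
print(cell.means)

# Intercept of the sum-to-zero fit
fit.sum <- lm(y ~ dp * TNF,
              contrasts = list(dp = "contr.sum", TNF = "contr.sum"))
gm <- unname(coef(fit.sum)["(Intercept)"])

print(gm)                # about 4.7312, the grand mean
print(mean(cell.means))  # the same number: average of the four group means
print(4 * gm)            # about 18.9249, the sum of the four group means
```

The last value is the 18.9249361 from the original question, which is why it looks four times too large for a mean.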

But it would be best to have a look at linear models and 
parameterizations first.

e.g.
http://www.statsoft.com/textbook/stglz.html
Otherwise "The R book" has a good section on contrasts.

If you don't want to pursue that further, use approach 1 in the limma 
guide; it is usually the easiest one and helps you formulate the 
question you really want to ask.
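
For reference, a rough sketch of what approach 1 (one coefficient per group, then explicit contrasts) looks like. It is written in base R with lm() for a single gene so that it runs without limma; with limma, the same design would go to lmFit() and the contrasts to makeContrasts() and contrasts.fit(). The group labels and contrast names are made up for this illustration, and the data are the gene 1 values quoted further down in this thread.

```r
# Gene 1 expression values and factors as quoted later in this thread
y <- c(4.865401, 5.114202, 4.719609, 4.882969, 4.857923, 4.807370,
       4.538509, 4.759865, 4.779017, 4.430844, 4.519123, 4.499975)
dp  <- factor(rep(c("Yes", "No", "Yes", "No"), each = 3), levels = c("No", "Yes"))
TNF <- factor(rep(c("No", "Yes"), each = 6), levels = c("No", "Yes"))

# One group per combination; labels are "<dp level>.<TNF level>"
group <- factor(paste(dp, TNF, sep = "."),
                levels = c("No.No", "Yes.No", "No.Yes", "Yes.Yes"))

# No intercept: each coefficient is simply a group mean
fit <- lm(y ~ 0 + group)
mu <- coef(fit)
names(mu) <- levels(group)

# Contrasts written as the questions being asked (rows follow levels(group))
cm <- cbind(
  TNF.in.dpNo  = c(-1,  0,  1, 0),  # No.Yes  - No.No
  TNF.in.dpYes = c( 0, -1,  0, 1),  # Yes.Yes - Yes.No
  Interaction  = c( 1, -1, -1, 1)   # (Yes.Yes - Yes.No) - (No.Yes - No.No)
)

# Estimated contrasts: t(cm) %*% mu, returned as a named vector
est <- drop(crossprod(cm, mu))
print(round(est, 4))
```

Here every contrast is spelled out in terms of group means, so there is no factor of four to keep track of; TNF.in.dpNo and Interaction should reproduce the TNF and Interaction coefficients of the treatment-contrast fit in the original question.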

Perhaps it would be helpful to add a few comments to the limma User's 
Guide here?

HTH
Björn

john seers (IFR) wrote:
> Hi Bjoern
> 
> Thanks for the reply.
> 
> I am following the example on page 47 exactly, the only difference being using dp as Strain and TNF as Treatment. 
> 
> Here are my factors, which show which measurements correspond to which treatment:
> 
>> dp
>  [1] Yes Yes Yes No  No  No  Yes Yes Yes No  No  No 
> Levels: No Yes
> 
>> TNF
>  [1] No  No  No  No  No  No  Yes Yes Yes Yes Yes Yes
> Levels: No Yes
> 
>> If you then compare these values with the ones you really want to 
>> extract you can come up with some simple transformations to do so.
> 
> I have not got to that stage yet of what I "really" want to extract. I am trying to understand exactly why these two approaches are equivalent and what the figures actually represent.  
> 
>> In your example you also seem to extract different things from the 
>> treatment-contrast parametrization than from the sum to zero 
>> parametrization.
> 
> In both cases I am extracting the major/primary coefficients and seeing how they relate. So they will be different. I am not extracting anything specific yet. I am having trouble with a description of a coefficient that is described as the "Grand mean" but is 4 times too big for what I think of as a Grand mean.
> 
> The only directly comparable coefficient in these two approaches is the interaction and they are the same in the example. (If multiplied by 4). So, assuming it is correct to multiply by 4 what is the interpretation of the Grand mean coefficient at 18.9249361? If it is not correct to multiply by 4 what is the interpretation of an interaction coefficient that is 4 times smaller than the treatment contrasts coefficient?
> 
> I have run an anova on this gene and with a bit of fiddling I can derive all the figures supplied by limma in both approaches and how they are linked. Except for when they should be 4 times bigger or 4 times smaller.  
> 
> Regards
> 
> John
> ---
> 
> -----Original Message-----
> From: Bjoern Usadel [mailto:usadel at mpimp-golm.mpg.de] 
> Sent: 24 February 2009 12:18
> To: john seers (IFR)
> Subject: Re: [BioC] limma - interpreting factorial design
> 
> Dear John,
> 
> could you please also post
> which of your measurements correspond to which treatment?
> 
> What helps a lot in interpretation is regrouping the terms on page 47 of 
> the user guide e.g. (WT.U-WT.S+Mu.U-Mu.S)/4 and then comparing these to 
> other contrasts or the contrast of interest.
> If you then compare these values with the ones you really want to 
> extract you can come up with some simple transformations to do so.
> 
> In your example you also seem to extract different things from the 
> treatment-contrast parametrization than from the sum to zero 
> parametrization.
> 
> contrast.matrix<-cbind(Intercept=c(1, 0, 0, 0), dp=c(0,1,0,0),
> TNF=c(0,0,1,0), Interaction=c(0,0,0,1))
> 
> If TNF is a factor exactly as in the limma example, this would most 
> likely not extract the TNF main effect.
> Also, the intercept has a different meaning, which might cause the 
> differences.
> 
> Best Wishes,
> Björn
> 
> john seers (IFR) wrote:
>> Hello All
>>
>> Can someone help me with unravelling a bit of confusion I have about the
>> limma factorial design?
>>
>> 8.7 Factorial Designs (page 47 approx.) in the user guide has three
>> approaches that are basically equivalent. I am comparing the "sum to
>> zero" and the "treatment contrast" approaches. In the sum to zero
>> approach the comparisons are divided by 4 and this is where my
>> misunderstanding lies.
>>
>> Just looking at the first gene as an example. I have put the expression
>> values below to give an idea of the magnitudes. 
>>
>> With the treatment contrast just extracting the coefficients straight I
>> get the following (code below):
>>
>> eb$coef[1,]
>> #  Intercept          dp         TNF Interaction 
>> # 4.84942088  0.05031631 -0.36610669  0.15883329
>>
>> With the sum to zero the comparisons are divided by 4. So one way to
>> extract the coefficients is below in the code. Using this way (in effect
>> multiplying by 4) I get the following:
>>
>> eb$coef[1,]
>> #         gm          dp         TNF Interaction 
>> # 18.9249361  -0.2594659   0.5733801   0.1588333
>>
>> So here is my problem. The grand mean looks 4 times too large but the
>> interaction matches the interaction from the treatments contrast
>> approach. So I can have one "looking" right but not both. i.e. To
>> multiply by 4 or not to multiply by 4, that is the question. How do I
>> interpret this? What am I missing in my understanding?
>>
>> Thanks for any help
>>
>>
>> Regards
>>
>> John
>>
>>
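>> # Sum to zero code
>>
>> # design was not shown in the post; assumed here to use sum-to-zero
>> # contrasts for both factors, as in the user guide
>> design<-model.matrix(~dp*TNF, contrasts.arg=list(dp="contr.sum", TNF="contr.sum"))
>> fit<-lmFit(eset, design)
>> # each coefficient is multiplied by 4 to undo the division by 4
>> contrast.matrix<-cbind(gm=c(4,0,0,0), dp=c(0,4,0,0), TNF=c(0,0,4,0),
>> Interaction=c(0,0,0,4)) 
>> #contrast.matrix<-cbind(Interaction=c(0,0,-2,-2)) 
>> fit2<-contrasts.fit(fit, contrast.matrix)
>> eb<-eBayes(fit2)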
>>
>>
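>> # Treatment contrasts code
>> design<-model.matrix(~dp*TNF) 
>> fit<-lmFit(eset, design)
>> # identity contrasts: the coefficients are extracted unchanged
>> contrast.matrix<-cbind(Intercept=c(1, 0, 0, 0), dp=c(0,1,0,0),
>> TNF=c(0,0,1,0), Interaction=c(0,0,0,1))
>> # (assumed: same contrasts.fit/eBayes steps as in the sum to zero code)
>> fit2<-contrasts.fit(fit, contrast.matrix)
>> eb<-eBayes(fit2)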
>>
>>
>> # Gene 1 expression level
>>
>> exprs1<-exprs[1,]
>> #     4.865401      5.114202      4.719609      4.882969      4.857923 
>> #     4.807370      4.538509      4.759865      4.779017      4.430844 
>> #     4.519123      4.499975
>>
> 

-- 
--------------------------------------------------
Björn Usadel, PhD
Max Planck Institute of Molecular Plant Physiology
AG Integrative Carbon Biology
Am Muehlenberg 1
14476 Potsdam-Golm
Tel.: +49 331 5678153
email usadel at mpimp-golm.mpg.de
http://tinyurl.com/IntegrativeCarbonBiology