[BioC] limma and 2way anova
Gordon Smyth
smyth at wehi.EDU.AU
Fri Aug 18 07:08:46 CEST 2006
Hi Lisa and James,
You are both correct, I think. The contrast given by Lisa is the
KOvsWT effect under the contr.sum (sum-to-zero) parametrization, while
the contrast given by James is the KOvsWT effect under the
contr.treatment (treatment) parametrization. See ?contr.sum. The
sum-to-zero parametrization is the classical parametrization used by
statistics textbooks for factorial anova. The treatment
parametrization is the default used by R for linear models and anova.
It was popularised many years ago by programs such as GLIM.
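As a rough illustration of the two codings, here is a minimal base R sketch. The factor names (genotype, cell) and the one-sample-per-group layout are made up for the example and are not taken from Lisa's design file.

```r
# Coding matrices for a two-level factor under each scheme.
contr.treatment(2)  # level 1 is the baseline (coded 0), level 2 is coded 1
contr.sum(2)        # level 1 is coded +1, level 2 is coded -1

# Hypothetical 2x2 layout, one sample per group, WT as the first level.
genotype <- factor(c("KO", "WT", "KO", "WT"), levels = c("WT", "KO"))
cell     <- factor(c("cell1", "cell1", "cell2", "cell2"))

# Treatment parametrization (R's default for unordered factors).
X.treat <- model.matrix(~ genotype * cell)

# Sum-to-zero parametrization, requested for each factor.
X.sum <- model.matrix(~ genotype * cell,
                      contrasts.arg = list(genotype = "contr.sum",
                                           cell = "contr.sum"))

colnames(X.treat)
colnames(X.sum)
```

The same model formula gives two different design matrices, so the column labelled as the genotype effect does not mean the same thing in the two fits.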
The moral is that there is no unique definition of main effect in a
two-way anova. Neither parametrization is right or wrong. The sum to
zero parametrization gives you the genotype effect averaged over the
two cell types. The treatment parametrization gives you the genotype
effect for cell type 1 (the baseline level) only. The lack of a unique
definition is the reason why I, in effect, force limma users to specify
the contrasts explicitly.
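To make the difference concrete, here is a small base R sketch using made-up group means for a single gene. The numbers are purely illustrative, not real data. It solves the saturated model under each parametrization and compares the genotype coefficient with explicit contrasts of the four group means. Note that with contr.sum the coefficient is half the averaged difference, and its sign depends on which level comes first.

```r
# Made-up group means for one gene (illustrative numbers only).
mu <- c(cell1.KO = 3, cell1.WT = 1, cell2.KO = 2, cell2.WT = 2)

genotype <- factor(c("KO", "WT", "KO", "WT"), levels = c("WT", "KO"))
cell     <- factor(c("cell1", "cell1", "cell2", "cell2"))

# Saturated models: four coefficients reproduce the four means exactly.
X.treat <- model.matrix(~ genotype * cell)
X.sum   <- model.matrix(~ genotype * cell,
                        contrasts.arg = list(genotype = "contr.sum",
                                             cell = "contr.sum"))
b.treat <- setNames(solve(X.treat, mu), colnames(X.treat))
b.sum   <- setNames(solve(X.sum, mu), colnames(X.sum))

# Explicit contrasts of the group means, in the order of mu.
avg.contrast   <- c(0.5, -0.5, 0.5, -0.5)  # KO vs WT averaged over cell types
cell1.contrast <- c(1, -1, 0, 0)           # KO vs WT in cell type 1 only

sum(avg.contrast * mu)    # 1
-2 * b.sum["genotype1"]   # 1: the sum-to-zero genotype effect, rescaled
sum(cell1.contrast * mu)  # 2
b.treat["genotypeKO"]     # 2: the treatment genotype effect
```

In limma the same explicit contrasts would normally be written with makeContrasts() and applied with contrasts.fit(). The sketch avoids those calls only so that it runs without the package installed.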
I have tried to explain the different parametrizations in Section 8.7
"Factorial Designs" of the limma User's Guide. Please have a look at
this. I'd be pleased for any feedback on how helpful it is.
Best wishes
Gordon
>Date: Wed, 16 Aug 2006 11:31:32 -0400
>From: "James W. MacDonald" <jmacdon at med.umich.edu>
>Subject: Re: [BioC] limma and 2way anova
>To: Lisa Luo <lisaluo_bioc at yahoo.com>
>Cc: bioconductor at stat.math.ethz.ch
>
>Hi Lisa,
>
>Lisa Luo wrote:
> > Hi List, I have a question regarding Limma and 2way ANOVA. I have a
> > data set containing 2 cell lines and a gene knockout. So in the
> > design file, I have cell1.KO, cell1.WT, cell2.KO and cell2.WT. I
> > want to get the differentially expressed genes between KO and WT. Is
> > the contrast (0.5*(cell1.KO-cell1.WT+cell2.KO-cell2.WT)) right? Is
> > this the same as looking at the knockout effect in 2way anova? When I
> > take a look at the heatmap, the genes identified seem to be
> > differentially expressed in only one of the comparisons?
>
>This is not the same as a conventional main effect in a two-way ANOVA,
>but it _does_ measure the difference between KO and WT. Just not the way
>that you might think.
>
>Note that the statistic you are constructing has the average difference
>between KO and WT in the numerator, and a moderated measure of the
>standard error associated with each coefficient in the denominator.
>
>What this means is you will select those genes where the average
>difference between KO and WT is 'large' and the variability of each
>group (cell1.KO, cell2.KO, cell1.WT, cell2.WT) is low. Because of this,
>you can get a significant contrast if e.g., cell1.KO - cell1.WT is large
>(but cell2.KO - cell2.WT is very small), if the variance estimate for
>each term is small, which is what I think you are seeing.
>
>If you are really looking for a standard main effect (i.e., KO vs WT
>ignoring cell type) then you can do one of two things. Either set
>your parameterization up so you are explicitly fitting a model with
>main effects, or fit a simpler model where you are just doing a
>t-test comparing KO vs WT and pooling the cell types.
>
>As an example, let's say you have two replicates of each sample, and the
>replicates look like this:
>
> > rep(c("cell1.KO","cell1.WT","cell2.KO","cell2.WT"), each=2)
>[1] "cell1.KO" "cell1.KO" "cell1.WT" "cell1.WT" "cell2.KO" "cell2.KO"
>"cell2.WT" "cell2.WT"
>
>Now you can set up your model like this:
>
> > KO <- factor(rep(1:2, each = 2, times = 2))
> > KO
>[1] 1 1 2 2 1 1 2 2
>Levels: 1 2
> > CELL <- factor(rep(1:2, each = 4))
> > CELL
>[1] 1 1 1 1 2 2 2 2
>Levels: 1 2
> > design <- model.matrix(~KO + CELL)
> > design
>  (Intercept) KO2 CELL2
>1           1   0     0
>2           1   0     0
>3           1   1     0
>4           1   1     0
>5           1   0     1
>6           1   0     1
>7           1   1     1
>8           1   1     1
>attr(,"assign")
>[1] 0 1 2
>attr(,"contrasts")
>attr(,"contrasts")$KO
>[1] "contr.treatment"
>
>attr(,"contrasts")$CELL
>[1] "contr.treatment"
>
>Now your second coefficient (KO2) measures the difference between KO and
>WT (as WT minus KO, since level 1 of the KO factor is the knockout)
>while ignoring cell type, just like in a conventional two-way ANOVA.
>
>HTH,
>
>Jim
>
>
> >
> > Thanks,
> >
> > Lisa
>
>
>--
>James W. MacDonald, M.S.
>Biostatistician
>Affymetrix and cDNA Microarray Core
>University of Michigan Cancer Center
>1500 E. Medical Center Drive
>7410 CCGC
>Ann Arbor MI 48109
>734-647-5623