[BioC] Yet another nested design in limma
Gordon K Smyth
smyth at wehi.EDU.AU
Wed May 6 02:05:14 CEST 2009
Dear Paolo,
As Naomi Altman as already told you, analysing an experiment such as this
is straightforward with limma. I guess the problem you are having is that
you are trying to use the limma User's Guide's suggestion of forming a
composite factor out of the individual factors (called the group means
parametrization), and you don't know how to define contrasts for
interactions from this factor. This does become a little more involved
for experiments with more factors. Can I suggest that you instead make
use of the factorial formulae in R when you make up the design matrix,
then you can probably dispense with the contrast step altogether.
You could for example use
targets <- read.delim("targets.txt")
design <- model.matrix(~Batch+Sex*(Phen/Line), data=targets)
This will produce a design matrix with the following columns.
> colnames(design)
[1] "(Intercept)" "Batch" "SexM"
[4] "PhenH" "PhenL" "PhenA:Line"
[7] "PhenH:Line" "PhenL:Line" "SexM:PhenH"
[10] "SexM:PhenL" "SexM:PhenA:Line" "SexM:PhenH:Line"
[13] "SexM:PhenL:Line"
To find genes significant for the sex x line interaction, you can simply
use
fit <- lmFit(eset, design)
fit <- eBayes(fit)
topTable(fit, coef=9:13)
On the other hand,
topTable(fit, coef=9:10)
is the sex x phen interaction.
Finally, you can add the biolrep as a random effect using the
duplicateCorrelation() function with block argument, as explained in the
limma User's Guide, but I am not convinced yet that this is absolutely
necessary for your experiment.
Can I also suggest that you stroll over to the mathematics department at
Uppsala and talk to someone interested in bioinformatics and microarray
analysis, say Professor Tom Britton, and see if you can get ongoing help
with statistics and design issues.
Best wishes
Gordon
> Date: Mon, 04 May 2009 14:09:14 +0200
> From: Paolo Innocenti <paolo.innocenti at ebc.uu.se>
> Subject: Re: [BioC] Yet another nested design in limma
> Cc: AAA - Bioconductor <bioconductor at stat.math.ethz.ch>
>
> Hi all,
>
> since I received a few emails in my mailbox by people interested in a
> solution for this design (or a design similar to this one), but there is
> apparently no (easy) solution in limma, I was wondering if anyone could
> suggest a package for differential expression analysis that allows
> dealing with:
>
> nested designs,
> random effects,
> multiple factorial designs with more than 2 levels.
>
> I identified siggenes, maanova, factDesign that could fit my needs, but
> I would like to have a comment by someone with more experience before
> diving into a new package.
>
> Best,
> paolo
>
>
>
> Paolo Innocenti wrote:
>> Hi Naomi and list,
>>
>> some time ago I asked a question on how to model an experiment in limma.
>> I think I need some additional help with it as the experiment grew in
>> complexity. I also added a factor "batch" because the arrays were run in
>> separate batches, and I think would be good to control for it.
>> The dataframe with phenotypic informations ("dummy") looks like this:
>>
>> >> Phen Line Sex Batch BiolRep
>> >> File1 H 1 M 1 1
>> >> File2 H 1 M 1 2
>> >> File3 H 1 M 2 3
>> >> File4 H 1 M 2 4
>> >> File5 H 1 F 1 1
>> >> File6 H 1 F 1 2
>> >> File7 H 1 F 2 3
>> >> File8 H 1 F 2 4
>> >> File9 H 2 M 1 1
>> >> File10 H 2 M 1 2
>> >> File11 H 2 M 2 3
>> >> File12 H 2 M 2 4
>> >> File13 H 2 F 1 1
>> >> File14 H 2 F 1 2
>> >> File15 H 2 F 2 3
>> >> File16 H 2 F 2 4
>> >> File17 L 3 M 1 1
>> >> File18 L 3 M 1 2
>> >> File19 L 3 M 2 3
>> >> File20 L 3 M 2 4
>> >> File21 L 3 F 1 1
>> >> File22 L 3 F 1 2
>> >> File23 L 3 F 2 3
>> >> File24 L 3 F 2 4
>> >> File25 L 4 M 1 1
>> >> File26 L 4 M 1 2
>> >> File27 L 4 M 2 3
>> >> File28 L 4 M 2 4
>> >> File29 L 4 F 1 1
>> >> File30 L 4 F 1 2
>> >> File31 L 4 F 2 3
>> >> File32 L 4 F 2 4
>> >> File33 A 5 M 1 1
>> >> File34 A 5 M 1 2
>> >> File35 A 5 M 2 3
>> >> File36 A 5 M 2 4
>> >> File37 A 5 F 1 1
>> >> File38 A 5 F 1 2
>> >> File39 A 5 F 2 3
>> >> File40 A 5 F 2 4
>> >> File41 A 6 M 1 1
>> >> File42 A 6 M 1 2
>> >> File43 A 6 M 2 3
>> >> File44 A 6 M 2 4
>> >> File45 A 6 F 1 1
>> >> File46 A 6 F 1 2
>> >> File47 A 6 F 2 3
>> >> File48 A 6 F 2 4
>>
>> In total I have
>> Factor "Phen", with 3 levels
>> Factor "Line", nested in Phen, 6 levels
>> Factor "Sex", 2 levels
>> Factor "Batch", 2 levels
>>
>> I am interested in:
>>
>> 1) Effect of sex (M vs F)
>> 2) Interaction between "Sex" and "Line" (or "Sex" and "Phen")
>>
>> Now, I can't really come up with a design matrix (not to mention the
>> contrast matrix).
>>
>> Naomi Altman wrote:
>>> You can design this in limma quite readily. Nesting really just means
>>> that only a subset of the possible contrasts are of interest. Just
>>> create the appropriate contrast matrix and you are all set.
>>
>> I am not really sure with what you mean here. Should I treat all the
>> factors as in a factorial design?
>> I might do something like this:
>>
>> phen <- factor(dummy$Phen)
>> line <- factor(dummy$Line)
>> sex <- factor(dummy$Sex)
>> batch <- factor(dummy$Batch)
>> fact <- factor(paste(sex,phen,line,sep="."))
>> design <- model.matrix(~ 0 + fact + batch)
>> colnames(design) <- c(levels(fact), "batch2")
>> fit <- lmFit(dummy.eset,design)
>> contrast <- makeContrasts(
>> sex= (F.H.1 + F.H.2 + F.L.3 + F.L.4 + F.A.5 + F.A.6) - (M.H.1 +
>> M.H.2 + M.L.3 + M.L.4 + M.A.5 + M.A.6),
>> levels=design)
>> fit2 <- contrasts.fit(fit,contrast)
>> fit2 <- eBayes(fit2)
>>
>> In this way I can correctly (I presume) obtain the effect of sex, but
>> how can I get the interaction term between sex and line?
>> I presume there is a "easy" way, but I can't see it...
>>
>> Thanks,
>> paolo
>>
>>
>>>
>>> --Naomi
>>>
>>> At 12:08 PM 2/16/2009, Paolo Innocenti wrote:
>>>> Hi all,
>>>>
>>>> I have an experimental design for a Affy experiment that looks like
>>>> this:
>>>>
>>>> Phen Line Sex Biol.Rep.
>>>> File1 H 1 M 1
>>>> File2 H 1 M 2
>>>> File3 H 1 F 1
>>>> File4 H 1 F 2
>>>> File5 H 2 M 1
>>>> File6 H 2 M 2
>>>> File7 H 2 F 1
>>>> File8 H 2 F 2
>>>> File9 L 3 M 1
>>>> File10 L 3 M 2
>>>> File11 L 3 F 1
>>>> File12 L 3 F 2
>>>> File13 L 4 M 1
>>>> File14 L 4 M 2
>>>> File15 L 4 F 1
>>>> File16 L 4 F 2
>>>>
>>>>
>>>> This appears to be a slightly more complicated situation than the one
>>>> proposed in the section 8.7 of the limma users guide (p.45) or by
>>>> Jenny on this post:
>>>>
>>>> https://stat.ethz.ch/pipermail/bioconductor/2006-February/011965.html
>>>>
>>>> In particular, I am intersted in
>>>> - Effect of "sex" (M vs F)
>>>> - Interaction between "sex" and "phenotype ("line" nested)
>>>> - Effect of "phenotype" in males
>>>> - Effect of "phenotype" in females
>>>>
>>>> Line should be nested in phenotype, because they are random "strains"
>>>> that happened to end up in phenotype H or L.
>>>>
>>>> Can I design this in limma? Is there a source of information about
>>>> how to handle with this? In particular, can I design a single model
>>>> matrix and then choose the contrasts I am interested in?
>>>>
>>>> Any help is much appreciated,
>>>> paolo
>>>>
>>>>
>>>> --
>>>> Paolo Innocenti
>>>> Department of Animal Ecology, EBC
>>>> Uppsala University
>>>> Norbyv?gen 18D
>>>> 75236 Uppsala, Sweden
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives:
>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>
>>> Naomi S. Altman 814-865-3791 (voice)
>>> Associate Professor
>>> Dept. of Statistics 814-863-7114 (fax)
>>> Penn State University 814-865-1348 (Statistics)
>>> University Park, PA 16802-2111
>>>
>>>
>>
>
> --
> Paolo Innocenti
> Department of Animal Ecology, EBC
> Uppsala University
> Norbyv?gen 18D
> 75236 Uppsala, Sweden
More information about the Bioconductor
mailing list