[BioC] Yet another nested design in limma
Gordon K Smyth
smyth at wehi.EDU.AU
Wed May 6 02:46:03 CEST 2009
Just a further note. In my previous code, Batch and Line were not
converted to factors. This is ok here, because Batch has 2 levels and
Line has only 2 levels within Phenotype. In general though, you will find
a design like this very much safer to analyse correctly if you re-number
the nested factor 1:2 within each Phenotype. Therefore,
targets <- read.delim("targets.txt")
Line <- 2 - (targets$Line %% 2)
Line <- factor(Line)
Batch <- factor(targets$Batch)
Sex <- factor(targets$Sex)
Phen <- factor(targets$Phen)
design <- model.matrix(~Batch+Sex*(Phen/Line), data=targets)
> colnames(design)
[1] "(Intercept)" "Batch2" "SexM"
[4] "PhenH" "PhenL" "PhenA:Line2"
[7] "PhenH:Line2" "PhenL:Line2" "SexM:PhenH"
[10] "SexM:PhenL" "SexM:PhenA:Line2" "SexM:PhenH:Line2"
[13] "SexM:PhenL:Line2"
etc
Best wishes
Gordon
On Wed, 6 May 2009, Gordon K Smyth wrote:
> Dear Paolo,
>
> As Naomi Altman as already told you, analysing an experiment such as this is
> straightforward with limma. I guess the problem you are having is that you
> are trying to use the limma User's Guide's suggestion of forming a composite
> factor out of the individual factors (called the group means
> parametrization), and you don't know how to define contrasts for interactions
> from this factor. This does become a little more involved for experiments
> with more factors. Can I suggest that you instead make use of the factorial
> formulae in R when you make up the design matrix, then you can probably
> dispense with the contrast step altogether.
>
> You could for example use
>
> targets <- read.delim("targets.txt")
> design <- model.matrix(~Batch+Sex*(Phen/Line), data=targets)
>
> This will produce a design matrix with the following columns.
>
> > colnames(design)
> [1] "(Intercept)" "Batch" "SexM"
> [4] "PhenH" "PhenL" "PhenA:Line"
> [7] "PhenH:Line" "PhenL:Line" "SexM:PhenH"
> [10] "SexM:PhenL" "SexM:PhenA:Line" "SexM:PhenH:Line"
> [13] "SexM:PhenL:Line"
>
> To find genes significant for the sex x line interaction, you can simply use
>
> fit <- lmFit(eset, design)
> fit <- eBayes(fit)
> topTable(fit, coef=9:13)
>
> On the other hand,
>
> topTable(fit, coef=9:10)
>
> is the sex x phen interaction.
>
> Finally, you can add the biolrep as a random effect using the
> duplicateCorrelation() function with block argument, as explained in the
> limma User's Guide, but I am not convinced yet that this is absolutely
> necessary for your experiment.
>
> Can I also suggest that you stroll over to the mathematics department at
> Uppsala and talk to someone interested in bioinformatics and microarray
> analysis, say Professor Tom Britton, and see if you can get ongoing help with
> statistics and design issues.
>
> Best wishes
> Gordon
>
>
>> Date: Mon, 04 May 2009 14:09:14 +0200
>> From: Paolo Innocenti <paolo.innocenti at ebc.uu.se>
>> Subject: Re: [BioC] Yet another nested design in limma
>> Cc: AAA - Bioconductor <bioconductor at stat.math.ethz.ch>
>>
>> Hi all,
>>
>> since I received a few emails in my mailbox by people interested in a
>> solution for this design (or a design similar to this one), but there is
>> apparently no (easy) solution in limma, I was wondering if anyone could
>> suggest a package for differential expression analysis that allows
>> dealing with:
>>
>> nested designs,
>> random effects,
>> multiple factorial designs with more than 2 levels.
>>
>> I identified siggenes, maanova, factDesign that could fit my needs, but
>> I would like to have a comment by someone with more experience before
>> diving into a new package.
>>
>> Best,
>> paolo
>>
>>
>>
>> Paolo Innocenti wrote:
>>> Hi Naomi and list,
>>>
>>> some time ago I asked a question on how to model an experiment in limma.
>>> I think I need some additional help with it as the experiment grew in
>>> complexity. I also added a factor "batch" because the arrays were run in
>>> separate batches, and I think would be good to control for it.
>>> The dataframe with phenotypic informations ("dummy") looks like this:
>>>
>>> >> Phen Line Sex Batch BiolRep
>>> >> File1 H 1 M 1 1
>>> >> File2 H 1 M 1 2
>>> >> File3 H 1 M 2 3
>>> >> File4 H 1 M 2 4
>>> >> File5 H 1 F 1 1
>>> >> File6 H 1 F 1 2
>>> >> File7 H 1 F 2 3
>>> >> File8 H 1 F 2 4
>>> >> File9 H 2 M 1 1
>>> >> File10 H 2 M 1 2
>>> >> File11 H 2 M 2 3
>>> >> File12 H 2 M 2 4
>>> >> File13 H 2 F 1 1
>>> >> File14 H 2 F 1 2
>>> >> File15 H 2 F 2 3
>>> >> File16 H 2 F 2 4
>>> >> File17 L 3 M 1 1
>>> >> File18 L 3 M 1 2
>>> >> File19 L 3 M 2 3
>>> >> File20 L 3 M 2 4
>>> >> File21 L 3 F 1 1
>>> >> File22 L 3 F 1 2
>>> >> File23 L 3 F 2 3
>>> >> File24 L 3 F 2 4
>>> >> File25 L 4 M 1 1
>>> >> File26 L 4 M 1 2
>>> >> File27 L 4 M 2 3
>>> >> File28 L 4 M 2 4
>>> >> File29 L 4 F 1 1
>>> >> File30 L 4 F 1 2
>>> >> File31 L 4 F 2 3
>>> >> File32 L 4 F 2 4
>>> >> File33 A 5 M 1 1
>>> >> File34 A 5 M 1 2
>>> >> File35 A 5 M 2 3
>>> >> File36 A 5 M 2 4
>>> >> File37 A 5 F 1 1
>>> >> File38 A 5 F 1 2
>>> >> File39 A 5 F 2 3
>>> >> File40 A 5 F 2 4
>>> >> File41 A 6 M 1 1
>>> >> File42 A 6 M 1 2
>>> >> File43 A 6 M 2 3
>>> >> File44 A 6 M 2 4
>>> >> File45 A 6 F 1 1
>>> >> File46 A 6 F 1 2
>>> >> File47 A 6 F 2 3
>>> >> File48 A 6 F 2 4
>>>
>>> In total I have
>>> Factor "Phen", with 3 levels
>>> Factor "Line", nested in Phen, 6 levels
>>> Factor "Sex", 2 levels
>>> Factor "Batch", 2 levels
>>>
>>> I am interested in:
>>>
>>> 1) Effect of sex (M vs F)
>>> 2) Interaction between "Sex" and "Line" (or "Sex" and "Phen")
>>>
>>> Now, I can't really come up with a design matrix (not to mention the
>>> contrast matrix).
>>>
>>> Naomi Altman wrote:
>>>> You can design this in limma quite readily. Nesting really just means
>>>> that only a subset of the possible contrasts are of interest. Just
>>>> create the appropriate contrast matrix and you are all set.
>>>
>>> I am not really sure with what you mean here. Should I treat all the
>>> factors as in a factorial design?
>>> I might do something like this:
>>>
>>> phen <- factor(dummy$Phen)
>>> line <- factor(dummy$Line)
>>> sex <- factor(dummy$Sex)
>>> batch <- factor(dummy$Batch)
>>> fact <- factor(paste(sex,phen,line,sep="."))
>>> design <- model.matrix(~ 0 + fact + batch)
>>> colnames(design) <- c(levels(fact), "batch2")
>>> fit <- lmFit(dummy.eset,design)
>>> contrast <- makeContrasts(
>>> sex= (F.H.1 + F.H.2 + F.L.3 + F.L.4 + F.A.5 + F.A.6) - (M.H.1 +
>>> M.H.2 + M.L.3 + M.L.4 + M.A.5 + M.A.6),
>>> levels=design)
>>> fit2 <- contrasts.fit(fit,contrast)
>>> fit2 <- eBayes(fit2)
>>>
>>> In this way I can correctly (I presume) obtain the effect of sex, but
>>> how can I get the interaction term between sex and line?
>>> I presume there is a "easy" way, but I can't see it...
>>>
>>> Thanks,
>>> paolo
>>>
>>>
>>>>
>>>> --Naomi
>>>>
>>>> At 12:08 PM 2/16/2009, Paolo Innocenti wrote:
>>>>> Hi all,
>>>>>
>>>>> I have an experimental design for a Affy experiment that looks like
>>>>> this:
>>>>>
>>>>> Phen Line Sex Biol.Rep.
>>>>> File1 H 1 M 1
>>>>> File2 H 1 M 2
>>>>> File3 H 1 F 1
>>>>> File4 H 1 F 2
>>>>> File5 H 2 M 1
>>>>> File6 H 2 M 2
>>>>> File7 H 2 F 1
>>>>> File8 H 2 F 2
>>>>> File9 L 3 M 1
>>>>> File10 L 3 M 2
>>>>> File11 L 3 F 1
>>>>> File12 L 3 F 2
>>>>> File13 L 4 M 1
>>>>> File14 L 4 M 2
>>>>> File15 L 4 F 1
>>>>> File16 L 4 F 2
>>>>>
>>>>>
>>>>> This appears to be a slightly more complicated situation than the one
>>>>> proposed in the section 8.7 of the limma users guide (p.45) or by
>>>>> Jenny on this post:
>>>>>
>>>>> https://stat.ethz.ch/pipermail/bioconductor/2006-February/011965.html
>>>>>
>>>>> In particular, I am intersted in
>>>>> - Effect of "sex" (M vs F)
>>>>> - Interaction between "sex" and "phenotype ("line" nested)
>>>>> - Effect of "phenotype" in males
>>>>> - Effect of "phenotype" in females
>>>>>
>>>>> Line should be nested in phenotype, because they are random "strains"
>>>>> that happened to end up in phenotype H or L.
>>>>>
>>>>> Can I design this in limma? Is there a source of information about
>>>>> how to handle with this? In particular, can I design a single model
>>>>> matrix and then choose the contrasts I am interested in?
>>>>>
>>>>> Any help is much appreciated,
>>>>> paolo
>>>>>
>>>>>
>>>>> --
>>>>> Paolo Innocenti
>>>>> Department of Animal Ecology, EBC
>>>>> Uppsala University
>>>>> Norbyv?gen 18D
>>>>> 75236 Uppsala, Sweden
>>>>>
>>>>> _______________________________________________
>>>>> Bioconductor mailing list
>>>>> Bioconductor at stat.math.ethz.ch
>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>> Search the archives:
>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>
>>>> Naomi S. Altman 814-865-3791 (voice)
>>>> Associate Professor
>>>> Dept. of Statistics 814-863-7114 (fax)
>>>> Penn State University 814-865-1348 (Statistics)
>>>> University Park, PA 16802-2111
>>>>
>>>>
>>>
>>
>> --
>> Paolo Innocenti
>> Department of Animal Ecology, EBC
>> Uppsala University
>> Norbyv?gen 18D
>> 75236 Uppsala, Sweden
>
More information about the Bioconductor
mailing list