[BioC] LIMMA experimental design affecting modelmatrix

Thu Dec 10 23:02:03 CET 2009

Hi Nadia,

Aubin-Horth Nadia wrote:
> Hi James
> 
> Thank you for your response.
> 
> We do have a dye-swap, which I represented with the "=" symbol between 
> my biological replicates. Sorry for the confusion.
> 
> Detailed experimental design (first sample is cy3, second sample is cy5):
> slide 1: A1-D1
> slide 2: D1-A1
> slide 3: A2-D2
> slide 4: D2-A2
> slide 5: A3-D3
> slide 6: D3-A3
> slide 7: A4-D4
> slide 8: D4-A4
> slide 9: A5-D5
> slide 10: D5-A5
> slide 11: A6-D6
> slide 12: D6-A6
> 
> The comparison of interest is the average difference in expression 
> between group A and group D, taking into account technical and 
> biological replicates. A1 to A6 are different individuals of the group 
> A, and D1 to D6 are different individuals of the group D, and these 
> individuals (fish actually) are wild caught and may vary in gene 
> expression even within the group.
> 
> In your message you say:
> 
>> All you can do is compare the two sample types. Since
>> the comparison is intrinsic to the data, all you want to do is compute
>> the mean of the ratios (or log ratios, actually) and then test to see if
>> the mean is different from zero.
>>
>> In this very simple case, you don't have to create a design matrix, as
>> limma will produce one for you. So you just do:
>>
>> fit <- lmFit(MAptip.nba.scale)
>> fit2 <- eBayes(fit)
>> topTable(fit2)
> 
> Reading the LIMMA User guide (section on technical replication, p.36), I 
> was under the impression that since I wanted to keep information for 
> each of the 6 biological replicates within a group in the test, as well 
> as include technical replicates which are not independent, I needed to 
> fit a linear model and then use a contrast. This is why we wanted to use 
> one of the individuals as a reference in the design matrix and carry on 
> with fitting the linear model and creating a contrast matrix that 
> represents the average difference between A and D. When I test 
> is.fullrank(design), I get "FALSE" as an answer. I suspect that our 
> experimental design is the cause, but I may be wrong.

Yes, it is your design matrix. You don't show exactly what it looks 
like, but if it isn't of full rank it won't work regardless.

I think you want to emulate the analysis on p37 (at least that's the 
page in my version of the Limma User's Guide). You have dye swaps and 
unlike the analysis I think you are following (which doesn't have equal 
technical replication), you have equal technical replication so can use 
duplicateCorrelation. The relevant section of the user's guide is this:

"If the technical replicates were in dye-swap pairs as
FileName Cy3 Cy5
File1 wt1 mu1
File2 mu1 wt1
File3 wt2 mu2
File4 mu2 wt2
then one might use
 > design <- c(1, -1, 1, -1)
 > corfit <- duplicateCorrelation(MA, design, ndups = 1, block = biolrep)
 > fit <- lmFit(MA, design, block = biolrep, cor = corfit$consensus)
 > fit <- eBayes(fit)
 > topTable(fit, adjust = "BH")"

Where biolrep in this case is c(1,1,2,2).

If you are concerned about dye-bias, you could augment the design matrix 
by adding a vector of all 1s as the first column.

Best,

Jim

> 
> Thank you
> 
> Nadia Aubin-Horth
> Laval University
> 
> On Dec 10, 2009, at 2:05 PM, James W. MacDonald wrote:
> 
>> Hi Nadia,
>>
>> Aubin-Horth Nadia wrote:
>>> We are analysing a cDNA microarray dataset in LIMMA with the following
>>> design and we run into "Coefficients not estimable" comments.
>>>
>>> R = 2.9.0
>>> limma=2.18
>>>
>>> We have two groups, "A" and "D" with 6 biological replicates each
>>> (indicated in the targets file). We are interested in significant gene
>>> expression differences between A and D on average. A given biological
>>> replicate of "A" was compared to a biological replicate of "D", with a
>>> dye-swap.
>>> A1=D1
>>> A2=D2
>>> A3=D3
>>> A4=D4
>>> A5=D5
>>> A6=D6
>>
>> I don't see a dye-swap here.
>>
>>>
>>> Therefore, we have six mini-experiments that are not connected one to
>>> the other.
>>>
>>> After normalisation, we use teh following design with the goal of doing
>>> a contrast as shown below
>>>
>>> design <- modelMatrix(targets, ref="D1")
>>> design <- cbind(Dye=1, design)
>>
>> Without a dye-swap (and you don't indicate one above), you cannot
>> estimate dye bias. All you can do is compare the two sample types. Since
>> the comparison is intrinsic to the data, all you want to do is compute
>> the mean of the ratios (or log ratios, actually) and then test to see if
>> the mean is different from zero.
>>
>> In this very simple case, you don't have to create a design matrix, as
>> limma will produce one for you. So you just do:
>>
>> fit <- lmFit(MAptip.nba.scale)
>> fit2 <- eBayes(fit)
>> topTable(fit2)
>>
>> FYI, what you are doing is to treat each sample as if it isn't
>> replicated, which is why you are running into problems.
>>
>> Best,
>>
>> Jim
>>
>>
>>>
>>> fit<-lmFit(MAptip.nba.scale,design)
>>>
>>> here we get:
>>> Coefficients not estimable: D5 A4 D2 D6 D3
>>>
>>> And we checked with
>>>> is.fullrank(design)
>>> and we get:
>>> [1] FALSE
>>>
>>> Our question is, is our experimental design (non loop, non reference
>>> design) with samples not being directly compared on a microarray causing
>>> these non estimable coefficients? If so, is there a way that we can keep
>>> the information on biological replicates and still use LIMMA?
>>>
>>> This is the contrast we were planning to use (which of course does not
>>> work)
>>> cont.matrix <- makeContrasts(AvsD = (A1 + A2 + A3 + A4 + A5 + A6
>>> - D2 - D3 - D4 - D5 - D6)/6, levels=design)
>>> fit2 <- contrasts.fit(fit, cont.matrix)
>>> Error in contrasts.fit(fit, cont.matrix) :
>>>  trying to take contrast of non-estimable coefficient
>>> In addition: Warning message:
>>> In any(contrasts[-est, ]) : coercing argument of type 'double' to 
>>> logical
>>> fit3 <- eBayes(fit2)
>>>
>>>
>>> Thank you
>>>
>>> Nadia Aubin-Horth
>>> Assistant professor
>>> Biology Department
>>> Institut de Biologie Intégrative et des Systèmes
>>> Université Laval
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>> -- 
>> James W. MacDonald, M.S.
>> Biostatistician
>> Douglas Lab
>> University of Michigan
>> Department of Human Genetics
>> 5912 Buhl
>> 1241 E. Catherine St.
>> Ann Arbor MI 48109-5618
>> 734-615-7826
>> **********************************************************
>> Electronic Mail is not secure, may not be read every day, and should 
>> not be used for urgent or sensitive issues
>>
> 

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**********************************************************
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues