[BioC] Limma design: paired samples, batch effect
Gordon K Smyth
smyth at wehi.EDU.AU
Sat Apr 4 05:56:50 CEST 2009
Dear John,
You can't expect to be able to estimate more things than your experiment
contains information about. In this case, your batches are entirely
confounded with patients+treatment, and hence you can't estimate
coefficients for them. That is what the message from limma is telling
you.
BTW, you got a diagnostic message and a warning, not an "error" message.
Best wishes
Gordon
> Date: Thu, 2 Apr 2009 15:22:36 +0300
> From: John Burger <clinicalgenomics at gmail.com>
> Subject: [BioC] Limma design: paired samples, batch effect
> To: bioconductor at stat.math.ethz.ch
> Content-Type: text/plain
>
> Hello,
>
> This is yet again a question about limma design matrices. I would like to
> perform analysis on single channel arrays (Affy). The experimental design
> is: paired samples before and after treatment, and I would also like to take
> into account batch effect.
>
> I am having problems with the batch effect, by reading through mails in this
> mailing list, I've gotten the impression that you can include batch to your
> design matrix, but when I try to do that, I get an error message. Here is a
> simulated example of what I am trying to do:
>
> Create example date, patient (5 patients, 2 samples), treatment (A or B),
> and batch (three batches, 1-3)
>
>> patient <- factor(rep(1:5,2))
>> patient
> [1] 1 2 3 4 5 1 2 3 4 5
> Levels: 1 2 3 4 5
>> treatment <- factor(c(rep("A",5), rep("B",5)))
>> treatment
> [1] A A A A A B B B B B
> Levels: A B
>> batch <- factor(c(rep(1,3), rep(2,3), rep(3,4)))
>> batch
> [1] 1 1 1 2 2 2 3 3 3 3
> Levels: 1 2 3
>
> If I would like to perform a basic analysis between the paired samples and
> not taking into account batch effect, I could just do:
>
>> design <- model.matrix(~patient+treatment)
>> design
> (Intercept) patient2 patient3 patient4 patient5 treatmentB
> 1 1 0 0 0 0 0
> 2 1 1 0 0 0 0
> 3 1 0 1 0 0 0
> 4 1 0 0 1 0 0
> 5 1 0 0 0 1 0
> 6 1 0 0 0 0 1
> 7 1 1 0 0 0 1
> 8 1 0 1 0 0 1
> 9 1 0 0 1 0 1
> 10 1 0 0 0 1 1
>
> And then use treatmentB to obtain the differences between treatments A and
> B.
>
> But, how could I include batch effect to the model? My inital thought was:
>
>> design <- model.matrix(~patient+treatment+batch)
>> design
> (Intercept) patient2 patient3 patient4 patient5 treatmentB batch2 batch3
> 1 1 0 0 0 0 0 0 0
> 2 1 1 0 0 0 0 0 0
> 3 1 0 1 0 0 0 0 0
> 4 1 0 0 1 0 0 1 0
> 5 1 0 0 0 1 0 1 0
> 6 1 0 0 0 0 1 1 0
> 7 1 1 0 0 0 1 0 1
> 8 1 0 1 0 0 1 0 1
> 9 1 0 0 1 0 1 0 1
> 10 1 0 0 0 1 1 0 1
>
> But when I do that with actual data, I get the following error message:
>
> Coefficients not estimable: batch2 batch3
> Warning message:
> Partial NA coefficients for 10163 probe(s)
>
> Any ideas what I am doing wrong, and more importantly, what I should do to
> take that batch effect into account?
>
> Thanks,
> John
More information about the Bioconductor
mailing list