[BioC] paired microarray samples

Chris Fjell cfjell at interchange.ubc.ca
Fri Oct 16 21:05:36 CEST 2009

I've a question on how to properly deal with paired samples in limma.

For example, I've got an experiment with 3 blood donors (D1, D2, D3) and
4 treatments (T1, T2, T3, T4) of the blood cells. One single-channel
array is done for each donor-treatment (e.g. D1 cells treated with T2).

In the past I've treated these as simple biological replicates, so the
code flow is like:

treatments=pData(BSData.quantile)$Sample_Group # T1, T2, T3, T4,  from
Illumina beadarray data
fit = lmFit(exprs(BSData.quantile), design)
cont.matrix = makeContrasts(T1vT2 = T1 - T2, T3vT1 = T3 - T1, levels =
fit = contrasts.fit(fit, cont.matrix)
ebFit = eBayes(fit)   

for( comp in colnames( cont.matrix ) ) {    
    tt = topTable(ebFit, coef = comp)
    [write results to file...]

But donors are responding quite differently so I don't want to simple
pool all the treatments, I want to consider response of each donor, i.e.
"Paired Samples".

I found this snippet on the BioC newsgroup at
Calculate the logFC ratios manually per donor, then pass it to lmFit() -
seems like the standard paired t-test.

For my example, for the comparison of T1 to T2
exM = exprs(BSData.quantile)
diffD1 = exM[, which( donors == "D1" & samples == "T1" )] - exM[, which(
donors == "D1" & samples == "T2" )]
diffD2 = exM[, which( donors == "D2" & samples == "T1" )] - exM[, which(
donors == "D2" & samples == "T2" )]
diffD3 = exM[, which( donors == "D3" & samples == "T1" )] - exM[, which(
donors == "D3" & samples == "T2" )]

paired_ex = cbind( diffD1, diffD2, diffD3)
fit = lmFit(paired_ex)          

The example cited above also uses "block = origin, correlation =
dupcor$cons" parameters to lmFit() but this is for technical replicates (?)
This works but I get dramatically worse p-values (due I assume to using
half the  number of "samples": 3 ratios tested against 0 versus testing
two groups of 3).

But the limma user's guide (Oct 22, 2008 version, Chapter 8, p.40) gives
another example, using model.matrix() not makeContrasts():

> SibShip <- factor(targets$SibShip)
> Treat <- factor(targets$Treatment, levels=c("C","T"))
> design <- model.matrix(~SibShip+Treat)
> fit <- lmFit(eset, design)
> fit <- eBayes(fit)
> topTable(fit, coef="TreatT")

I do the similar thing for my data ...

design_paired_treatments = model.matrix( ~donors+treatments )
fit_ps = lmFit( exprs(BSData.quantile),design_paired_treatments)
fit_ps = eBayes( fit_ps )  

and look at the columns of my design matrix and it's like
colnames( design_paired_treatments )
[1] "(Intercept)"    "donorsD1"       "donorsD2"       "treatmentsT1"
[5] "treatmentsT2"  "treatmentsT3"

Donor D3 and treatment T4 don't appear in the design.

and I get some results...
tt = topTable( fit_ps, coef="donorsD1")
          ID logFC AveExpr      t      P.Value    adj.P.Val    B
 6370315 -6.58    8.74      -119.4  4.26e-17   2.08e-12    17.6
 7160474  3.59    8.77       56.5     7.43e-14   1.37e-09    16.6
 5260484 -5.69    8.71      -55.8     8.39e-14  1.37e-09    16.6

This is a regression model, I think...  I can see why the two terms
(donor D3 and treatment T4) disappeared - they have to be restated in
terms of the rest, if I remember right for doing ANOVA with indicator

Is there a recommended way to do this with limma? Or another package?

Thanks for any advice.


More information about the Bioconductor mailing list