[BioC] doing paired t-test amongst several groups

Fri Feb 9 01:01:18 CET 2007

Hello Everyone,

I am wondering whether anyone has extended a paired t-test to multiple 
pairwise comparisons and can enlighten me on how to interpret the 
outcome. I have read the limma guide back and forth but still seem to be 
missing a few things.

Essentially I am doing a paired t-test, but have 3 treatments and wish 
to make pairwise comparisons of all combinations.

I have single-channel data (Illumina) that I imported using 
BeadExplorer, which creates an exprSet. I then RMA background-corrected 
the data and normalized it using quantile normalization from the 
BeadExplorer package, which essentially invokes limma's quantile 
normalization. This gave me an exprSet of normalized values, which I 
then log2 transformed.
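
For readers unfamiliar with the normalization step, here is a minimal 
base R sketch of the idea behind quantile normalization followed by a 
log2 transform. It is a simplified illustration only: the function name 
quantile_normalize and the simulated intensities are assumptions, and it 
does not reproduce how BeadExplorer or limma handle ties or missing 
values.

```r
# Simulated raw intensities (made up): 20 probes x 3 arrays
set.seed(42)
raw <- matrix(2^rnorm(60, mean = 8), nrow = 20, ncol = 3,
              dimnames = list(paste0("probe", 1:20), c("s1", "s2", "s3")))

# Hypothetical helper: force every column to share the same distribution
quantile_normalize <- function(x) {
  m <- unname(x)
  ranks  <- apply(m, 2, rank, ties.method = "first")  # rank within each array
  sorted <- apply(m, 2, sort)                         # sort each array
  target <- rowMeans(sorted)                          # mean of each quantile
  out <- apply(ranks, 2, function(r) target[r])       # map ranks to target
  dimnames(out) <- dimnames(x)
  out
}

qn      <- quantile_normalize(raw)
log2_qn <- log2(qn)   # log2 transform after normalization, as in the post
```

After this step every column of qn contains the same set of values, 
only in a different order.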

My experimental design is as follows: 5 patients were biopsied (OB1 to 
OB5), and each biopsy was split into 3 cultures of cells that each 
underwent a different treatment (surfaces A, B, C). I therefore have 3 
treatments, each with 5 replicates, but the replicates share the same 
origin, so it seems logical to me to analyse them as paired samples.

My challenge was to scale the paired t-test to 3 sets of comparisons.

First I read a targets file that specifies all the pairs and treatments:

 > targets <- readTargets("samples.txt")
 > targets
       FileName Patient Surface
1  1519138023_A     OB1     A
2  1488802050_A     OB1     B
3  1488802050_D     OB1     C
4  1519138023_B     OB2     A
5  1488802050_B     OB2     B
6  1488802050_E     OB2     C
7  1519138023_C     OB3     A
8  1488802050_C     OB3     B
9  1488802050_F     OB3     C
10 1519138023_D     OB4     A
11 1519138023_E     OB4     B
12 1519138023_F     OB4     C
13 1519138034_A     OB5     A
14 1519138034_B     OB5     B
15 1519138034_C     OB5     C

Then I make the design matrix:
 > Patients <- factor(targets$Patient)
 > Surfaces <- factor(targets$Surface, levels=c("A", "B", "C") )
 > paired_design <- model.matrix(~Patients+Surfaces)
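
The column layout of this design can be checked without the expression 
data. A minimal sketch, assuming the targets table is exactly as printed 
above (the data frame is rebuilt by hand here rather than read from 
samples.txt):

```r
# Rebuild the Patient/Surface columns of the targets table by hand
targets <- data.frame(
  Patient = rep(paste0("OB", 1:5), each = 3),
  Surface = rep(c("A", "B", "C"), times = 5)
)

Patients <- factor(targets$Patient)
Surfaces <- factor(targets$Surface, levels = c("A", "B", "C"))
paired_design <- model.matrix(~ Patients + Surfaces)

# One row per array, one column per model coefficient
colnames(paired_design)
# "(Intercept)" "PatientsOB2" "PatientsOB3" "PatientsOB4" "PatientsOB5"
# "SurfacesB"   "SurfacesC"
```

With the default treatment coding, the intercept corresponds to the 
reference levels (patient OB1 on surface A), and each remaining column 
is a difference from that reference.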

And then I fit a linear model and run eBayes:
 > fit_paired_RMAbg_Qnorm <- lmFit(data_log2_RMAbg_Qnorm, paired_design)
 > fit2_paired_RMAbg_Qnorm <- eBayes(fit_paired_RMAbg_Qnorm)

 > topTable(fit2_paired_RMAbg_Qnorm, number=2)
                 ID X.Intercept.   PatientsOB2 PatientsOB3 PatientsOB4
13720 GI_34304116-S     15.29244  1.431159e-15   0.1152188  0.14177094
11757 GI_31543813-S     15.14338 -1.090994e-01   0.1038085  0.08840763

      PatientsOB5 SurfacesSLA SurfacesSLAa  AveExpr        F
13720 -0.03689951 0.006326441   0.01046853 15.34205 30967.96
11757  0.01106040 0.080210742  -0.06714165 15.16657 29657.53

           P.Value    adj.P.Val
13720 1.549603e-24 1.816823e-20
11757 2.007728e-24 1.816823e-20

My questions are:
I am a bit confused by the fact that in the resulting table (shown by 
topTable) I am getting a column for the intercept (surface A), columns 
for the patients, and columns for the other surfaces. What do the values 
under the patient columns mean? Does the fact that they are being 
considered reduce the power of the comparison between surfaces?

As I am not interested in the differential expression amongst patients, 
how do I avoid these being considered?

How can I find out about the differences between surfaces B and C?

Do I need to, or can I, make a contrast matrix to specify which 
comparisons I want information for (only between surfaces, and not 
between patients)?

If I can make a contrast matrix, can you give me an example of how to do 
it with 3 treatments?
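
To make the question concrete, here is a minimal base R sketch of what 
such a contrast matrix could look like for 3 treatments. It assumes the 
~Patients+Surfaces design with surface levels A, B, C (so the surface 
columns are named SurfacesB and SurfacesC), and it uses simulated values 
for a single gene, not data from this experiment. The names 
contrast_matrix, BvsA, CvsA and CvsB are illustrative only.

```r
set.seed(1)
Patients <- factor(rep(paste0("OB", 1:5), each = 3))
Surfaces <- factor(rep(c("A", "B", "C"), times = 5),
                   levels = c("A", "B", "C"))
design <- model.matrix(~ Patients + Surfaces)

# Simulated log2 expression for ONE gene (made-up numbers)
y <- rnorm(15, mean = 10)

# Ordinary least-squares fit of the paired (patient-blocked) model
beta <- lm.fit(design, y)$coefficients

# Rows = design columns, columns = comparisons of interest.
# Patient columns get weight 0, so they are adjusted for but not tested.
contrast_matrix <- cbind(
  BvsA = c(0, 0, 0, 0, 0,  1, 0),
  CvsA = c(0, 0, 0, 0, 0,  0, 1),
  CvsB = c(0, 0, 0, 0, 0, -1, 1)
)
rownames(contrast_matrix) <- colnames(design)

# Estimated log2 fold change for each pairwise surface comparison
est <- drop(t(contrast_matrix) %*% beta)

# Sanity check: C vs B equals the mean within-patient difference
paired_CvsB <- mean(y[Surfaces == "C"] - y[Surfaces == "B"])
```

In limma, a matrix with this layout is the kind of argument that 
contrasts.fit() takes, with eBayes() and topTable() applied afterwards; 
that part is not run here and is untested on the data above.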

Many Thanks!

Milena