[BioC] doing paired t-test amongst several groups
Milena Gongora
m.gongora at imb.uq.edu.au
Fri Feb 9 01:01:18 CET 2007
Hello Everyone,
I am wondering if anyone has scaled a paired t-test up to multiple
pairwise comparisons and can enlighten me on how to interpret the
outcome. I have read the limma guide back and forth but still seem to be
missing a few things.
Essentially I am doing a paired t-test, but have 3 treatments and wish
to make pairwise comparisons of all combinations.
I have single-channel data (Illumina) that I imported using
BeadExplorer, which creates an exprSet. I then RMA-background-corrected
the data and normalized it using the quantile normalization from the
BeadExplorer package, which essentially invokes limma's quantile
normalization. The result was an exprSet of normalized values, which I
then log2 transformed.
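To make the normalization step concrete, here is a minimal base-R sketch of what quantile normalization followed by a log2 transform does. It is not the BeadExplorer or limma implementation; the `quantile_normalize` helper and the toy intensity matrix are hypothetical and for illustration only.

```r
# Toy intensity matrix: 6 probes x 3 arrays (hypothetical values, not real data)
set.seed(1)
raw <- matrix(rexp(18, rate = 1 / 200) + 50, nrow = 6,
              dimnames = list(paste0("probe", 1:6), c("arr1", "arr2", "arr3")))

# Hypothetical helper: force every array to share the same distribution
quantile_normalize <- function(x) {
  sorted <- apply(x, 2, sort)   # sort the values within each array
  target <- rowMeans(sorted)    # mean of the k-th smallest value across arrays
  out <- apply(x, 2, function(col) target[rank(col, ties.method = "first")])
  dimnames(out) <- dimnames(x)
  out
}

norm <- quantile_normalize(raw)
log2_norm <- log2(norm)         # log2 transform after normalization, as in the post

# After normalization every array contains the same set of values
print(apply(norm, 2, sort))
```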
My experimental design is as follows: 5 patients (OB1 to OB5) were
biopsied, and each biopsy was split into 3 cultures of cells, each of
which underwent a different treatment (surfaces A, B, C). I therefore
have 3 treatments, each with 5 replicates, but the replicates share the
same origin, so it seems logical to me to analyse them as paired samples.
My challenge was to scale the paired t-test to 3 sets of comparisons.
First I read a targets file that specifies all the pairs and treatments:
> targets <- readTargets("samples.txt")
> targets
       FileName Patient Surface
1  1519138023_A     OB1       A
2  1488802050_A     OB1       B
3  1488802050_D     OB1       C
4  1519138023_B     OB2       A
5  1488802050_B     OB2       B
6  1488802050_E     OB2       C
7  1519138023_C     OB3       A
8  1488802050_C     OB3       B
9  1488802050_F     OB3       C
10 1519138023_D     OB4       A
11 1519138023_E     OB4       B
12 1519138023_F     OB4       C
13 1519138034_A     OB5       A
14 1519138034_B     OB5       B
15 1519138034_C     OB5       C
Then I make the design matrix:
> Patients <- factor(targets$Patient)
> Surfaces <- factor(targets$Surface, levels=c("A", "B", "C") )
> paired_design <- model.matrix(~Patients+Surfaces)
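For reference, the columns that this formula produces can be checked in base R without any expression data. The sketch below rebuilds the Patient and Surface columns of the targets table by hand (an assumption standing in for samples.txt) and uses R's default treatment coding, under which the intercept corresponds to patient OB1 on surface A, the Patients columns act as blocking terms, and the Surfaces columns are within-patient differences from surface A.

```r
# Rebuild the Patient/Surface columns of the targets table shown in the post
targets <- data.frame(
  Patient = rep(paste0("OB", 1:5), each = 3),
  Surface = rep(c("A", "B", "C"), times = 5)
)

Patients <- factor(targets$Patient)
Surfaces <- factor(targets$Surface, levels = c("A", "B", "C"))
paired_design <- model.matrix(~ Patients + Surfaces)

# One row per array, one column per coefficient
print(dim(paired_design))
print(colnames(paired_design))
```

With this coding, SurfacesB estimates B minus A and SurfacesC estimates C minus A, each adjusted for patient.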
And then I fit a linear model and run eBayes:
> fit_paired_RMAbg_Qnorm <- lmFit(data_log2_RMAbg_Qnorm, paired_design)
> fit2_paired_RMAbg_Qnorm <- eBayes(fit_paired_RMAbg_Qnorm)
> topTable(fit2_paired_RMAbg_Qnorm, number=2)
                 ID X.Intercept.   PatientsOB2 PatientsOB3 PatientsOB4
13720 GI_34304116-S     15.29244  1.431159e-15   0.1152188  0.14177094
11757 GI_31543813-S     15.14338 -1.090994e-01   0.1038085  0.08840763
      PatientsOB5 SurfacesSLA SurfacesSLAa  AveExpr        F
13720 -0.03689951 0.006326441   0.01046853 15.34205 30967.96
11757  0.01106040 0.080210742  -0.06714165 15.16657 29657.53
           P.Value    adj.P.Val
13720 1.549603e-24 1.816823e-20
11757 2.007728e-24 1.816823e-20
My questions are:
I am a bit confused by the fact that the resulting table (shown by
topTable) has a column for the intercept (which I take to be surface A),
as well as columns for all the patients and for the other surfaces. What
do the values under the patient columns mean? Does the fact that they
are being considered reduce the power of the comparison to the other
surfaces?
As I am not interested in the differential expression amongst patients,
how do I avoid these being considered?
How can I find out about the differences between surfaces B and C?
Do I need to, or can I, make a contrast matrix to specify which
comparisons I want information for (only between surfaces, and not
amongst patients)?
If I can make a contrast matrix, can you give me an example of how to do
it with 3 treatments?
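A minimal sketch of what such a contrast matrix might look like for this design, assuming the `~Patients+Surfaces` parametrization above with surface A as the reference level. The matrix is built by hand in base R; the contrast names (BvsA, CvsA, CvsB) and the simulated expression matrix are hypothetical, and the limma part only runs if limma is installed.

```r
# Design as in the post: 5 patients x 3 surfaces
Patients <- factor(rep(paste0("OB", 1:5), each = 3))
Surfaces <- factor(rep(c("A", "B", "C"), times = 5), levels = c("A", "B", "C"))
design <- model.matrix(~ Patients + Surfaces)

# One column per comparison; rows follow the design columns.
# Intercept and patient rows stay at zero, so only surfaces are compared.
contrast_matrix <- matrix(0, nrow = ncol(design), ncol = 3,
                          dimnames = list(colnames(design),
                                          c("BvsA", "CvsA", "CvsB")))
contrast_matrix["SurfacesB", "BvsA"] <- 1
contrast_matrix["SurfacesC", "CvsA"] <- 1
contrast_matrix["SurfacesC", "CvsB"] <- 1    # C minus B is the difference
contrast_matrix["SurfacesB", "CvsB"] <- -1   # of the two surface coefficients

print(contrast_matrix)

# Optional: apply the contrasts with limma on simulated data (not real results)
if (requireNamespace("limma", quietly = TRUE)) {
  set.seed(42)
  expr <- matrix(rnorm(200 * nrow(design)), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), NULL))
  fit <- limma::lmFit(expr, design)
  fit2 <- limma::eBayes(limma::contrasts.fit(fit, contrast_matrix))
  print(limma::topTable(fit2, coef = "CvsB", number = 2))
}
```

Passing `coef` to topTable restricts the ranking to a single comparison, so the patient terms stay in the model as blocking factors without being tested themselves.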
Many Thanks!
Milena