[BioC] Another limma factorial that needs controlling and pairing
Karl Brand
k.brand at erasmusmc.nl
Wed Aug 11 15:59:20 CEST 2010
Hi Sabrina,
Yes, i believe i did figure out a contrast matrix which allowed me to
use limma to indentify DE genes between two factors of interest
('Tissue' and 'Pperiod' in my case) whilst controlling for a third
factor ('Time' in my case where the samples were collected at different
times of the day as part of another question incorporated in the same
experiment but not discussed here). So to be clear, my goal was to find
DE genes between all pairwise comparisons of 6 data sets, namely, R.S,
R.E, R.L, C.S, C.E and C.L and also for any genes significant for
interaction between 'Tissue' and 'Pperiod'.
In my original post "[BioC] Another limma factorial that needs
controlling and pairing" i'd outlined an approach based closly on the
first example discussed in Gorodn Smyth's chapter 23*, pg412. In the end
this approach didn't work for me. Instead i succeeded using the
"treatment-contrast" parametrization (pg414) example. You'll find a
representation of the code i used for my analysis below. I haven't run
it, but for the purpose of illustration, together with my targets file
should suffice in explaining exactly how i used limma. The most
diffciult part for me was understanding the way R sets up this
"treatment-contrast" parametrization. To be sure about how i was setting
up the contrasts, i manually calculated the Log2FC for each contrast and
compared this with the limma output. This confirmation was critical for
a dilatante like myself to be sure my costrasts were indeed what i
thought they were. I would advise similar such corss checking of your
contrasts.
As you'll see from my code, i used 'decide tests' since overall
significance of genes across all contrasts was of greater help to me in
understanding the biology. As to which correction method you use, there
is plenty of online info (BioC archives, limma user guide) to clarify
this. In short, "global" seemed the best balanced method for my
question(s).
Final note on controlling for the paring of tissues (being obtained from
the same animal): i omitted this for simplicty in the code below. In
fact Prof. Smyth clarified out to deal with this using
duplicateCorrelation in a later BioC post if this is relevant for you;
and i can also repost if needed.
hth,
Karl
*http://www.statsci.org/smyth/pubs/limma-biocbook-reprint.pdf
###not run
library(limma)
targets <- read.delim("RNA_Targets.txt")
Tissue <- factor(targets$Tissue, levels = c("R", "C"))
Pperiod <- factor(targets$Pperiod, levels = c("E", "L", "S"))
Time <- factor(targets$Time)
design <- model.matrix(~Tissue * Pperiod + Time)
colnames(design)
fit <- lmFit(rma.pp, design)
cont.matrix <- cbind(
R.E_R.S = c(0,0,0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0),
R.E_R.L = c(0,0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0),
R.S_R.L = c(0,0,-1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0),
C.E_C.S = c(0,0,0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1),
C.E_C.L = c(0,0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1,0),
C.S_C.L = c(0,0,-1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1,1),
R.S_C.S = c(0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1),
R.E_C.E = c(0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0),
R.L_C.L = c(0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1,0),
R.E_R.S__C.E_C.S = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1),#int
R.E_R.L__C.E_C.L = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1,0))#int.2
fit2 <- contrasts.fit(fit, cont.matrix)
fit3 <- eBayes(fit2)
#write resutls of contrasts for 0.05
results <- (decideTests(fit3, method="global", adjust.method="BH",
p.value=0.05))
write.fit(fit3, results, file="w.int.glob.BH.0.05.csv", sep=",")
#write resutls of contrasts for 0.01
results <- (decideTests(fit3, method="global", adjust.method="BH",
p.value=0.01))
write.fit(fit3, results, digits=10, file="w.int.glob.BH.0.01.csv", sep=",")
###end
"RNA_Targets.txt":
FileName Tissue Pperiod Time Animal
01file.CEL R S 1 1
02file.CEL C S 1 1
03file.CEL R S 2 2
04file.CEL C S 2 2
05file.CEL R S 3 3
06file.CEL C S 3 3
07file.CEL R S 4 4
08file.CEL C S 4 4
09file.CEL R S 5 5
10file.CEL C S 5 5
11file.CEL R S 6 6
12file.CEL C S 6 6
13file.CEL R S 7 7
14file.CEL C S 7 7
15file.CEL R S 8 8
16file.CEL C S 8 8
17file.CEL R S 9 9
18file.CEL C S 9 9
19file.CEL R S 10 10
20file.CEL C S 10 10
21file.CEL R S 11 11
22file.CEL C S 11 11
23file.CEL R S 12 12
24file.CEL C S 12 12
25file.CEL R S 13 13
26file.CEL C S 13 13
27file.CEL R S 14 14
28file.CEL C S 14 14
29file.CEL R S 15 15
30file.CEL C S 15 15
31file.CEL R S 16 16
32file.CEL C S 16 16
33file.CEL R E 1 17
34file.CEL C E 1 17
35file.CEL R E 2 18
36file.CEL C E 2 18
37file.CEL R E 3 19
38file.CEL C E 3 19
39file.CEL R E 4 20
40file.CEL C E 4 20
41file.CEL R E 5 21
42file.CEL C E 5 21
43file.CEL R E 6 22
44file.CEL C E 6 22
45file.CEL R E 7 23
46file.CEL C E 7 23
47file.CEL R E 8 24
48file.CEL C E 8 24
49file.CEL R E 9 25
50file.CEL C E 9 25
51file.CEL R E 10 26
52file.CEL C E 10 26
53file.CEL R E 11 27
54file.CEL C E 11 27
55file.CEL R E 12 28
56file.CEL C E 12 28
57file.CEL R E 13 29
58file.CEL C E 13 29
59file.CEL R E 14 30
60file.CEL C E 14 30
61file.CEL R E 15 31
62file.CEL C E 15 31
63file.CEL R E 16 32
64file.CEL C E 16 32
65file.CEL R L 1 33
66file.CEL C L 1 33
67file.CEL R L 2 34
68file.CEL C L 2 34
69file.CEL R L 3 35
70file.CEL C L 3 35
71file.CEL R L 4 36
72file.CEL C L 4 36
73file.CEL R L 5 37
74file.CEL C L 5 37
75file.CEL R L 6 38
76file.CEL C L 6 38
77file.CEL R L 7 39
78file.CEL C L 7 39
79file.CEL R L 8 40
80file.CEL C L 8 40
81file.CEL R L 9 41
82file.CEL C L 9 41
83file.CEL R L 10 42
84file.CEL C L 10 42
85file.CEL R L 11 43
86file.CEL C L 11 43
87file.CEL R L 12 44
88file.CEL C L 12 44
89file.CEL R L 13 45
90file.CEL C L 13 45
91file.CEL R L 14 46
92file.CEL C L 14 46
93file.CEL R L 15 47
94file.CEL C L 15 47
95file.CEL R L 16 48
96file.CEL C L 16 48
On 8/10/2010 4:21 PM, sabrina wrote:
>
>
>
> Hi, Karl:
> I have a similar question as yours, did you figure it out since your last post?
>
> Other question I have is about using topTable. In your previous
> design, did you use topTable for each contrast, meaning for each particular
> contrast, how you find the top list of genes? If that is the case, what is the
> best way to sort the genes? Did you use B value or adj.p? Or you just used
> decideTests?
>
> Thanks
>
> Sabrina
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
--
Karl Brand <k.brand at erasmusmc.nl>
Department of Genetics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
P +31 (0)10 704 3409 | F +31 (0)10 704 4743 | M +31 (0)642 777 268
More information about the Bioconductor
mailing list