[BioC] Another limma factorial that needs controlling and pairing

Wed Aug 11 15:59:20 CEST 2010

Hi Sabrina,

Yes, i believe i did figure out a contrast matrix which allowed me to 
use limma to indentify DE genes between two factors of interest 
('Tissue' and 'Pperiod' in my case) whilst controlling for a third 
factor ('Time' in my case where the samples were collected at different 
times of the day as part of another question incorporated in the same 
experiment but not discussed here). So to be clear, my goal was to find 
DE genes between all pairwise comparisons of 6 data sets, namely, R.S, 
R.E, R.L, C.S, C.E and C.L and also for any genes significant for 
interaction between 'Tissue' and 'Pperiod'.

In my original post "[BioC] Another limma factorial that needs 
controlling and	pairing" i'd outlined an approach based closly on the 
first example discussed in Gorodn Smyth's chapter 23*, pg412. In the end 
this approach didn't work for me. Instead i succeeded using the 
"treatment-contrast" parametrization (pg414) example. You'll find a 
representation of the code i used for my analysis below. I haven't run 
it, but for the purpose of illustration, together with my targets file 
should suffice in explaining exactly how i used limma. The most 
diffciult part for me was understanding the way R sets up this 
"treatment-contrast" parametrization. To be sure about how i was setting 
up the contrasts, i manually calculated the Log2FC for each contrast and 
compared this with the limma output. This confirmation was critical for 
a dilatante like myself to be sure my costrasts were indeed what i 
thought they were. I would advise similar such corss checking of your 
contrasts.

As you'll see from my code, i used 'decide tests' since overall 
significance of genes across all contrasts was of greater help to me in 
understanding the biology. As to which correction method you use, there 
is plenty of online info (BioC archives, limma user guide) to clarify 
this. In short, "global" seemed the best balanced method for my 
question(s).

Final note on controlling for the paring of tissues (being obtained from 
the same animal): i omitted this for simplicty in the code below. In 
fact Prof. Smyth clarified out to deal with this using 
duplicateCorrelation in a later BioC post if this is relevant for you; 
and i can also repost if needed.

hth,

Karl

*http://www.statsci.org/smyth/pubs/limma-biocbook-reprint.pdf

###not run
library(limma)
targets <- read.delim("RNA_Targets.txt")
Tissue <- factor(targets$Tissue, levels = c("R", "C"))
Pperiod <- factor(targets$Pperiod, levels = c("E", "L", "S"))
Time <- factor(targets$Time)

design <- model.matrix(~Tissue * Pperiod + Time)
colnames(design)

fit <- lmFit(rma.pp, design)
cont.matrix <- cbind(

R.E_R.S = c(0,0,0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0),
R.E_R.L = c(0,0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0),
R.S_R.L = c(0,0,-1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0),

C.E_C.S = c(0,0,0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1),
C.E_C.L = c(0,0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1,0),
C.S_C.L = c(0,0,-1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1,1),

R.S_C.S = c(0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1),
R.E_C.E = c(0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0),
R.L_C.L = c(0,-1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1,0),

R.E_R.S__C.E_C.S = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1),#int
R.E_R.L__C.E_C.L = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,-1,0))#int.2

fit2 <- contrasts.fit(fit, cont.matrix)
fit3 <- eBayes(fit2)

#write resutls of contrasts for 0.05
results <- (decideTests(fit3, method="global", adjust.method="BH", 
p.value=0.05))
write.fit(fit3, results, file="w.int.glob.BH.0.05.csv", sep=",")

#write resutls of contrasts for 0.01
results <- (decideTests(fit3, method="global", adjust.method="BH", 
p.value=0.01))
write.fit(fit3, results, digits=10, file="w.int.glob.BH.0.01.csv", sep=",")

###end

"RNA_Targets.txt":

FileName    Tissue    Pperiod    Time    Animal
01file.CEL    R    S    1    1
02file.CEL    C    S    1    1
03file.CEL    R    S    2    2
04file.CEL    C    S    2    2
05file.CEL    R    S    3    3
06file.CEL    C    S    3    3
07file.CEL    R    S    4    4
08file.CEL    C    S    4    4
09file.CEL    R    S    5    5
10file.CEL    C    S    5    5
11file.CEL    R    S    6    6
12file.CEL    C    S    6    6
13file.CEL    R    S    7    7
14file.CEL    C    S    7    7
15file.CEL    R    S    8    8
16file.CEL    C    S    8    8
17file.CEL    R    S    9    9
18file.CEL    C    S    9    9
19file.CEL    R    S    10    10
20file.CEL    C    S    10    10
21file.CEL    R    S    11    11
22file.CEL    C    S    11    11
23file.CEL    R    S    12    12
24file.CEL    C    S    12    12
25file.CEL    R    S    13    13
26file.CEL    C    S    13    13
27file.CEL    R    S    14    14
28file.CEL    C    S    14    14
29file.CEL    R    S    15    15
30file.CEL    C    S    15    15
31file.CEL    R    S    16    16
32file.CEL    C    S    16    16
33file.CEL    R    E    1    17
34file.CEL    C    E    1    17
35file.CEL    R    E    2    18
36file.CEL    C    E    2    18
37file.CEL    R    E    3    19
38file.CEL    C    E    3    19
39file.CEL    R    E    4    20
40file.CEL    C    E    4    20
41file.CEL    R    E    5    21
42file.CEL    C    E    5    21
43file.CEL    R    E    6    22
44file.CEL    C    E    6    22
45file.CEL    R    E    7    23
46file.CEL    C    E    7    23
47file.CEL    R    E    8    24
48file.CEL    C    E    8    24
49file.CEL    R    E    9    25
50file.CEL    C    E    9    25
51file.CEL    R    E    10    26
52file.CEL    C    E    10    26
53file.CEL    R    E    11    27
54file.CEL    C    E    11    27
55file.CEL    R    E    12    28
56file.CEL    C    E    12    28
57file.CEL    R    E    13    29
58file.CEL    C    E    13    29
59file.CEL    R    E    14    30
60file.CEL    C    E    14    30
61file.CEL    R    E    15    31
62file.CEL    C    E    15    31
63file.CEL    R    E    16    32
64file.CEL    C    E    16    32
65file.CEL    R    L    1    33
66file.CEL    C    L    1    33
67file.CEL    R    L    2    34
68file.CEL    C    L    2    34
69file.CEL    R    L    3    35
70file.CEL    C    L    3    35
71file.CEL    R    L    4    36
72file.CEL    C    L    4    36
73file.CEL    R    L    5    37
74file.CEL    C    L    5    37
75file.CEL    R    L    6    38
76file.CEL    C    L    6    38
77file.CEL    R    L    7    39
78file.CEL    C    L    7    39
79file.CEL    R    L    8    40
80file.CEL    C    L    8    40
81file.CEL    R    L    9    41
82file.CEL    C    L    9    41
83file.CEL    R    L    10    42
84file.CEL    C    L    10    42
85file.CEL    R    L    11    43
86file.CEL    C    L    11    43
87file.CEL    R    L    12    44
88file.CEL    C    L    12    44
89file.CEL    R    L    13    45
90file.CEL    C    L    13    45
91file.CEL    R    L    14    46
92file.CEL    C    L    14    46
93file.CEL    R    L    15    47
94file.CEL    C    L    15    47
95file.CEL    R    L    16    48
96file.CEL    C    L    16    48

On 8/10/2010 4:21 PM, sabrina wrote:
>
>
>
> Hi, Karl:
> I have a similar question as yours, did you figure it out since your last post?
>
> Other question I have is about using topTable. In your previous
> design, did you use topTable for each contrast, meaning for each particular
> contrast, how you find the top list of genes? If that is the case, what is the
> best way to sort the genes? Did you use B value or adj.p? Or you just used
> decideTests?
>
> Thanks
>
> Sabrina
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
Karl Brand <k.brand at erasmusmc.nl>
Department of Genetics
Erasmus MC
Dr Molewaterplein 50
3015 GE Rotterdam
P +31 (0)10 704 3409 | F +31 (0)10 704 4743 | M +31 (0)642 777 268