[BioC] design matrix question
Marcelo Luiz de Laia
mlaia at fcav.unesp.br
Fri Feb 3 04:07:21 CET 2006
Hello,
After reading the users guide, I have begun to analyse our data.
With my biologist background, I am having trouble building a proper
design matrix.
We are trying to analyse a 4x2x3 factorial design. We have 3 biological
replicates.
I searched the list archives and found a lot of messages about
design in limma, but I still have doubts about it.
I would like to understand the design matrix.
For example, I have these targets in a pData (made by hand):
FileName Variedade Treatment Time
1 File01 Var1 Con T1
2 File02 Var1 Tra T1
3 File03 Var1 Con T1
4 File04 Var1 Tra T1
5 File05 Var1 Con T1
6 File06 Var1 Tra T1
7 File07 Var1 Con T2
8 File08 Var1 Tra T2
9 File09 Var1 Con T2
10 File10 Var1 Tra T2
11 File11 Var1 Con T2
12 File12 Var1 Tra T2
13 File13 Var1 Con T3
14 File14 Var1 Tra T3
15 File15 Var1 Con T3
16 File16 Var1 Tra T3
17 File17 Var1 Con T3
18 File18 Var1 Tra T3
(...)
72 File72 Var4 Tra T3
Then, I made this design matrix, by hand:
design <-
model.matrix(~-1+factor(c(1,2,1,2,1,2,3,4,3,4,3,4,5,6,5,6,5,6,
7,8,7,8,7,8,9,10,9,10,9,10,11,12,11,12,11,12,
13,14,13,14,13,14,15,16,15,16,15,16,
17,18,17,18,17,18,19,20,19,20,19,20,
21,22,21,22,21,22,23,24,23,24,23,24)))
colnames(design) <- c("Ctrl_v1a1","Data_v1a1","Ctrl_v1a2","Data_v1a2",
"Ctrl_v1a3","Data_v1a3",
"Ctrl_v2a1","Data_v2a1","Ctrl_v2a2","Data_v2a2",
"Ctrl_v2a3","Data_v2a3",
"Ctrl_v3a1","Data_v3a1","Ctrl_v3a2","Data_v3a2",
"Ctrl_v3a3","Data_v3a3",
"Ctrl_v4a1","Data_v4a1","Ctrl_v4a2","Data_v4a2",
"Ctrl_v4a3","Data_v4a3")
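The hand-typed group codes above could also be derived from the targets
table, which avoids counting errors. What follows is only a minimal sketch
in base R. It assumes the 72 arrays follow the ordering shown in the targets
(Con/Tra alternating, three replicate pairs per time, three times per
variety). The `targets` data frame is rebuilt here for illustration, and the
names `labels`, `group` and `design2` are hypothetical, not taken from the
analysis above.

```r
# Rebuild the targets table under the assumed ordering (illustration only).
targets <- data.frame(
  FileName  = sprintf("File%02d", 1:72),
  Variedade = rep(paste0("Var", 1:4), each = 18),
  Time      = rep(rep(paste0("T", 1:3), each = 6), times = 4),
  Treatment = rep(c("Con", "Tra"), times = 36)
)

# One label per Variedade x Time x Treatment cell; levels are kept in order
# of first appearance so they line up with the hand-typed codes 1..24.
labels <- paste(targets$Treatment, targets$Variedade, targets$Time, sep = "_")
group  <- factor(labels, levels = unique(labels))

# Design without intercept: one indicator column per group, so each
# coefficient estimates the mean of one group.
design2 <- model.matrix(~ 0 + group)
colnames(design2) <- levels(group)

dim(design2)      # 72 rows (arrays), 24 columns (groups)
colSums(design2)  # 3 arrays in every group
```

With a design of this form, each column stands for a single group, so
treated vs control comparisons only appear once differences between columns
are taken.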
I would like to know what coef=1 in the topTable function refers to.
With this answer I will be able to understand what the other coefs are.
At this step, is it impossible to get the differentially expressed genes
for Var 1 at time 1? Tra is treated and Con is a control (not treated).
After this step, I made a contrast matrix in this way:
contrast.matrix <-
makeContrasts("Data_v1a1-Ctrl_v1a1","Data_v1a2-Ctrl_v1a2",
"Data_v1a3-Ctrl_v1a3",
"Data_v2a1-Ctrl_v2a1","Data_v2a2-Ctrl_v2a2",
"Data_v2a3-Ctrl_v2a3",
"Data_v3a1-Ctrl_v3a1","Data_v3a2-Ctrl_v3a2",
"Data_v3a3-Ctrl_v3a3",
"Data_v4a1-Ctrl_v4a1","Data_v4a2-Ctrl_v4a2",
"Data_v4a3-Ctrl_v4a3",
levels=design)
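To make the structure of such contrasts concrete, here is a minimal sketch
in base R of what a single contrast vector looks like over the 24 design
columns. The helper `make_contrast` and the object names are hypothetical
and only for illustration; in limma the same comparisons would be given to
makeContrasts as strings such as "Data_v1a3-Data_v1a2".

```r
# The 24 column names, in the same order as assigned to the design above.
cells <- paste0("v", rep(1:4, each = 3), "a", 1:3)
cols  <- c(outer(c("Ctrl", "Data"), cells, paste, sep = "_"))

# Hypothetical helper: +1 on one column, -1 on another, 0 elsewhere.
make_contrast <- function(plus, minus, levels) {
  v <- setNames(numeric(length(levels)), levels)
  v[plus]  <- 1
  v[minus] <- -1
  v
}

# Treated vs control for Var 1 at time 1 (first contrast in the list above).
trt_v1a1 <- make_contrast("Data_v1a1", "Ctrl_v1a1", cols)

# Time 3 vs time 2 within the treated arrays of Var 1 (not in the list above).
t3_vs_t2 <- make_contrast("Data_v1a3", "Data_v1a2", cols)

# Show only the non-zero entries of each contrast.
trt_v1a1[trt_v1a1 != 0]
t3_vs_t2[t3_vs_t2 != 0]
```

Any comparison between groups can be written this way, as long as both
groups have their own column in the design.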
At this step, does coef=1 in the topTable function give the
differentially expressed genes for Var 1 at time 1?
With this design, are we able to get the differentially expressed genes
for time 3 vs time 2? And for treated vs control?
I tried another design based on the targets:
design<-model.matrix(~Variedade*Treatment*Time, data=pData(targets))
In this case, I have an intercept plus 23 more columns.
In this case, what comparisons are given by coef=1 in the topTable
function? And by coef=2? And by coef=23?
Do I need to make a contrast matrix, too?
In this case, do I not need makeContrasts to get the differentially
expressed genes for Var 4 vs Var 1 across all times, i.e., with treated
and control both included?
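One way to see what each coefficient of the factorial model refers to is to
print the column names of the design matrix. The sketch below uses base R
only and rebuilds the targets under the assumption that all 72 arrays follow
the ordering shown above; `targets_df` and `design_fac` are hypothetical
names. It relies on R's default treatment coding, in which the first level
of each factor (here Var1, Con and T1) is the reference.

```r
# Rebuild the targets with explicit factors (assumed ordering, illustration only).
targets_df <- data.frame(
  Variedade = factor(rep(paste0("Var", 1:4), each = 18)),
  Treatment = factor(rep(c("Con", "Tra"), times = 36)),
  Time      = factor(rep(rep(paste0("T", 1:3), each = 6), times = 4))
)

# Full factorial model: main effects plus all two- and three-way interactions.
design_fac <- model.matrix(~ Variedade * Treatment * Time, data = targets_df)

ncol(design_fac)      # 24 = intercept + 23 further columns
colnames(design_fac)  # shows which term each coef number corresponds to
```

Under this coding the intercept corresponds to the reference cell (Var1, Con,
T1), and the other columns are differences and interaction terms relative to
it, so the coef numbers here do not mean the same thing as in the design with
one column per group.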
I guess that if I resolve these doubts I will be able to build my design
matrix correctly. So, I decided to ask the Bioconductor list for some advice.
I am interested in several aspects (contrasts), like differences between
Variedade types with treatment, and differences between Variedade types
without treatment (time excluded or included).
Which contrasts can answer these questions?
Thanks in advance for your assistance.
--
Marcelo Luiz de Laia
Ph.D Candidate
São Paulo State University (http://www.unesp.br/eng/)
School of Agricultural and Veterinary Sciences
Department of Technology
Via de Acesso Prof.Paulo Donato Castellane s/n
14884-900 Jaboticabal - SP - Brazil
Phone: +55-016-3209-2675
Cell: +55-016-97098526