[BioC] Multifactorial analysis with RMA and LIMMA of Affymetrix microarrays

Thu Aug 5 20:42:12 CEST 2004

What eventually was the correct design matrix for this dataset?

Peter

On Mar 17, 2004, at 10:48 AM, Jordi Altirriba Gutiérrez wrote:

> Thank you very much Gordon for your quick answer!
> My phenoData is:
>> pData(eset)
>     DIABETES TREATMENT
> DNT1     TRUE       FALSE
> DNT2     TRUE       FALSE
> DNT3     TRUE       FALSE
> DT1      TRUE        TRUE
> DT2      TRUE        TRUE
> DT3      TRUE        TRUE
> SNT1    FALSE       FALSE
> SNT2    FALSE       FALSE
> SNT3    FALSE       FALSE
> ST1     FALSE        TRUE
> ST2     FALSE        TRUE
> ST3     FALSE        TRUE
>
> (DNT=Diabetic untreated, DT=Diabetic treated, SNT=Health treated,  
> ST=Health untreated)
>
> I want to know the genes characteristics of the diabetes, the  
> treatment and the treatment + diabetes. Moreover when I analyse my  
> data with SAM and I compare Health treated vs the Health untreated I  
> don't see many differences, but when I compare the Diabetic treated vs  
> the Diabetic treated I see a lot of differences, so is correct to  
> apply a 2 x 2 factorial design?
> Is LIMMA the correct tool to answer my questions? If it is the correct  
> tool, how can I do a factorial design matrix (if to do a factorial  
> design is correct)? (Robert Gentleman has suggested me to use the  
> factDesign).
> Thank you very much for your time, patience and your suggestions.
> Yours sincerely,
>
>
>> From: Gordon Smyth <smyth at wehi.edu.au>
>> To: Jordi Altirriba Gutiérrez <altirriba at hotmail.com>
>> CC: bioconductor at stat.math.ethz.ch
>> Subject: Re: [BioC] Multifactorial analysis with RMA and LIMMA of   
>> Affymetrix microarrays
>> Date: Wed, 17 Mar 2004 11:32:16 +1100
>>
>> At 07:55 AM 17/03/2004, Jordi Altirriba Gutiérrez wrote:
>>> (Sorry, but I've had some problems with the HTML)
>>> Hello all!
>>> I am a beginner user of R and Bioconductor, sorry if my questions  
>>> have already been discussed previously.
>>> I am studying the effects of a new hypoglycaemic drug for the  
>>> treatment of diabetes and I have done this classical study:
>>> 4 different groups:
>>> 1.- Healthy untreated
>>> 2.- Healthy treated
>>> 3.- Diabetic untreated
>>> 4.- Diabetic treated
>>> With 3 biological replicates of each group, therefore I have done 12  
>>> arrays (Affymetrix).
>>> I have treated the raw data with the package RMA of Bioconductor  
>>> according to the article ”Exploration, normalization and summaries  
>>> of high density oligonucleotide array probe level data”  
>>> (Background=RMA, Normalization=quantiles, PM=PMonly,  
>>> Summarization=medianpolish).
>>> I am currently trying to analyse the object eset with the package  
>>> LIMMA of Bioconductor. I want to know what genes are differentially  
>>> expressed due to diabetes,  to the treatment and to the combination  
>>> of both (diabetes + treatment), being therefore an statistic  
>>> analysis similar to a two-ways ANOVA).
>>> So, my questions are:
>>> 1.- I have created a PhenoData in RMA, will the covariates of the  
>>> PhenoData have any influence in the analysis of LIMMA?
>>
>> Not automatically.
>>
>>> 2.- Are these commands correct to get these results? (see below) In  
>>> the command TopTable, the output of coef=1 are the genes  
>>> characteristics of diabetes?
>>
>> No.
>>
>>> 3.- If I do not see any effect of the treatment in the healthy  
>>> untreated rats should I design the matrix differently? Something  
>>> similar to a one-way-ANOVA, considering differently the four groups:
>>> ( > design<-model.matrix(~ -1+factor(c(1,1,1,2,2,2,3,3,3,4,4,4)))  ).
>>
>> This design matrix would be very much better, i.e., it would be a  
>> correct matrix. You could then use contrasts to test for differences  
>> and interaction terms between your four groups, and that would do the  
>> job.
>>
>> If you tell us what's in your phenoData slot, i.e., type pData(eset),  
>> then we might be able to suggest another approach analogous to the  
>> classical two-way anova approach.
>>
>> Gordon
>>
>>> 5.- Any other idea?
>>> Thank you very much for your time and your suggestions.
>>> Yours sincerely,
>>>
>>> Jordi Altirriba (PhD student, Hospital Clínic  IDIBAPS, Barcelona,  
>>> Spain)
>>>
>>>> design<- 
>>>> cbind("disease"=c(1,1,1,1,1,1,0,0,0,0,0,0),"treatment"=c(0,0,0,1,1,1 
>>>> ,0,0,0,1,1,1))
>>>> fit<-lmFit(eset,design)
>>>> contrast.matrix<-cbind("diabetes"=c(1,0),"drug"=c(0,1),"diabetes- 
>>>> drug"=c(1,1))
>>>> rownames(contrast.matrix)<-colnames(design)
>>>> design
>>>      disease treatment
>>> [1,]        1           0
>>> [2,]        1           0
>>> [3,]        1           0
>>> [4,]        1           1
>>> [5,]        1           1
>>> [6,]        1           1
>>> [7,]        0           0
>>> [8,]        0           0
>>> [9,]        0           0
>>> [10,]        0           1
>>> [11,]        0           1
>>> [12,]        0           1
>>>> contrast.matrix
>>>            diabetes drug diabetes-drug
>>> diabetes      1         0             1
>>> tratamiento   0         1             1
>>>> fit2<-contrasts.fit(fit,contrast.matrix)
>>>> fit2<-eBayes(fit2)
>>>> clas<-classifyTests(fit2)
>>>> vennDiagram(clas)
>>>> topTable(fit2,number=100,genelist=geneNames(eset),coef=1,adjust="fdr 
>>>> ")
>>>                     Name        M        t      P.Value        B
>>> 11590          1387471_at 19.32575 9.442926 2.743122e-17 29.95299
>>> 2500           1369951_at 19.24748 9.404683 2.743122e-17 29.66737
>>> 10652          1384778_at 19.22968 9.395981 2.743122e-17 29.60254
>>> …
>>>> sink("limma-diabetes.txt")
>>>> topTable(fit2,number=1000,genelist=geneNames(eset),coef=1,adjust="fd 
>>>> r")
>>>> sink()
>>
>
> _________________________________________________________________
> Reparaciones, servicios a domicilio, empresas, profesionales... Todo  
> en la guía telefónica de QDQ. http://qdq.msn.es/msn.cfm
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://www.stat.math.ethz.ch/mailman/listinfo/bioconductor
>