[BioC] problems in limma
Gordon K Smyth
smyth at wehi.EDU.AU
Thu Aug 2 02:36:28 CEST 2007
> Date: Tue, 31 Jul 2007 11:52:39 +0800 (CST)
> From: "De-Jian,ZHAO" <zhaodj at ioz.ac.cn>
> Subject: [BioC] problems in limma
> To: Bioconductor at stat.math.ethz.ch
>
> Dear list members,
>
> I am analysing my microarray data using limma package. Now I encounter
> several problems. Looking forward to your suggestions!
>
> Question 1:
> During the process of background correction using method="normexp", four
> warning messages appeared as "NaNs produced in: log(x)" (as you can see in
> the program posted below). What does that mean? How will it effect the
> final result? How could it be settled?
Probably this warning will affect your results only very slightly. However there is an update to
the normexp method in the developmental version of limma so that this warning is no longer
triggered.
Have you looked at your data (for example using MA-plots) to check that the data looks reasonable
after background correction?
> Question 2:
> On my microarray, every probe has two replicates.During the process of
> duplicateCorrelation, two warnings appear as "Too much damping -
> convergence tolerance not achievable" (as you also can see in the program
> posted below). What does it mean? Is there anything wrong with my data?
Please read the help page for duplicateCorrelation.
Again, have you looked at the value of corMA.pi$consensus to check that it is reasonable?
> Question 3:
> How to construct the design matrix is a puzzle to me. Here I constructed
> the design matrix using the function modelMatrix and the object targets.
> However, I am not sure whether it is constructed appropriately. Looking
> forward to your suggestions.
> (Additional info about my experimental design. Uppercase and lowercase
> words in the R object targets (see below in the posted program) have
> different meanings. The locusts on the plain [PLAIN] was treated [plain]
> in a simulated plateau environment while the locusts on the plateau
> [PLATEAU] was treated [plateau] in a simulated plain environment. They
> experienced different treatments. I think it is not a complete factorial
> design. Therefore I did not choose the design matrix for factorial
> designs. However, I do not know whether what I chose is appropriate.)
I'm not clear what your question is. limma treats "PLAIN" and "plain" as different, as you can
see from the output.
> Question 4:
> All in all, I wonder whether the differentially expressed genes produced
> via the posted program are convincing. Will the above-mentioned warnings
> affect the reliability of the final result? Can I continue to the next
> step?
The warnings you list do not appear to be a reason for concern. However, in any data analysis,
you need to check the reasonableness of your results at each step (using plots, summary etc), and
I'm not clear that you are doing this.
Best wishes
Gordon
> Thanks!
>
> Dejian Zhao
>
> ++++++++++++++++++ Program Starts +++++++++++++++++++++
>
>> library(limma)
>> library(statmod) #duplicateCorrelation requires this package.
>> targets<-readTargets()
>> targets
> Cy3 Cy5 FileName Date
> 1 PLAIN PLATEAU Locust 186.gpr 2006-5-31
> 2 PLAIN PLATEAU Locust 187.gpr 2006-5-31
> 3 PLAIN PLATEAU Locust 188.gpr 2006-5-31
> 4 PLAIN PLATEAU Locust 189.gpr 2006-5-31
> 5 PLAIN PLATEAU Locust 190.gpr 2006-5-31
> 6 PLAIN PLATEAU Locust 191.gpr 2006-5-31
> 7 plain PLAIN Locust 192.gpr 2006-6-6
> 8 plain PLAIN Locust 193.gpr 2006-6-6
> 9 plain PLAIN Locust 194.gpr 2006-6-6
> 10 plain PLAIN Locust 195.gpr 2006-6-6
> 11 plain PLAIN Locust 196.gpr 2006-6-6
> 12 plain PLAIN Locust 197.gpr 2006-6-6
> 13 plateau PLATEAU Locust 198.gpr 2006-6-8
> 14 plateau PLATEAU Locust 199.gpr 2006-6-8
> 15 plateau PLATEAU Locust 200.gpr 2006-6-8
> 16 plateau PLATEAU Locust 201.gpr 2006-6-8
> 17 plateau PLATEAU Locust 202.gpr 2006-6-8
> 18 plateau PLATEAU Locust 203.gpr 2006-6-8
>> RG<-read.maimages(targets,source="genepix",wt.fun=wtflags(0.1))
> Read Locust 186.gpr
> Read Locust 187.gpr
> Read Locust 188.gpr
> Read Locust 189.gpr
> Read Locust 190.gpr
> Read Locust 191.gpr
> Read Locust 192.gpr
> Read Locust 193.gpr
> Read Locust 194.gpr
> Read Locust 195.gpr
> Read Locust 196.gpr
> Read Locust 197.gpr
> Read Locust 198.gpr
> Read Locust 199.gpr
> Read Locust 200.gpr
> Read Locust 201.gpr
> Read Locust 202.gpr
> Read Locust 203.gpr
>> RG$genes<-readGAL()
>> spottypes<-readSpotTypes()
>> spottypes
> SpotType ID Name Color
> 1 gene * * black
> 2 blank Blank * brown
> 3 buffer *sc * blue
> 4 rice Os026* * green
> 5 beta-actin Beta* * red
> 6 18S 18S* * yellow
> 7 GAPDH GAPDH* * purple
>> RG$genes$Status<-controlStatus(spottypes,RG)
> Matching patterns for: ID Name
> Found 19200 gene
> Found 96 blank
> Found 220 buffer
> Found 192 rice
> Found 192 beta-actin
> Found 96 18S
> Found 96 GAPDH
> Setting attributes: values Color
>> RG.b<-backgroundCorrect(RG,method="normexp",offset=0)
> Corrected array 1
> Corrected array 2
> Corrected array 3
> Corrected array 4
> Corrected array 5
> Corrected array 6
> Corrected array 7
> Corrected array 8
> Corrected array 9
> Corrected array 10
> Corrected array 11
> Corrected array 12
> Corrected array 13
> Corrected array 14
> Corrected array 15
> Corrected array 16
> Corrected array 17
> Corrected array 18
> Warning messages:
> 1: NaNs produced in: log(x)
> 2: NaNs produced in: log(x)
> 3: NaNs produced in: log(x)
> 4: NaNs produced in: log(x)
>> w<-modifyWeights(RG$weights,RG$genes$Status,c("rice","beta-actin","18S","GAPDH"),c(0.1,2,2,2))
> MA.p<-normalizeWithinArrays(RG.b,weights=w,iterations=6)
>> design<-modelMatrix(targets,ref="PLAIN")
> Found unique target names:
> plain PLAIN plateau PLATEAU
>> design
> plain plateau PLATEAU
> [1,] 0 0 1
> [2,] 0 0 1
> [3,] 0 0 1
> [4,] 0 0 1
> [5,] 0 0 1
> [6,] 0 0 1
> [7,] -1 0 0
> [8,] -1 0 0
> [9,] -1 0 0
> [10,] -1 0 0
> [11,] -1 0 0
> [12,] -1 0 0
> [13,] 0 -1 1
> [14,] 0 -1 1
> [15,] 0 -1 1
> [16,] 0 -1 1
> [17,] 0 -1 1
> [18,] 0 -1 1
>> i<-MA.p$genes$Status=="gene"
>> corMA.pi<-duplicateCorrelation(MA.p[i,],design,ndups=2)
> Warning messages:
> 1: Too much damping - convergence tolerance not achievable in:
> glmgam.fit(dx, dy, start = start, tol = tol, maxit = maxit, trace = trace)
>
> 2: Too much damping - convergence tolerance not achievable in:
> glmgam.fit(dx, dy, start = start, tol = tol, maxit = maxit, trace = trace)
>
>> fitMA.pi<-lmFit(MA.p[i,],design,ndups=2,correlation=corMA.pi$consensus.correlation)
> contrast.matrix<-makeContrasts(plain,plateau-PLATEAU,PLATEAU,levels=design)
> contrast.matrix
> Contrasts
> Levels plain plateau - PLATEAU PLATEAU
> plain 1 0 0
> plateau 0 1 0
> PLATEAU 0 -1 1
>> colnames(contrast.matrix)<-cbind("plain-PLAIN","plateau-PLATEAU","PLATEAU-PLAIN")
> contrast.matrix
> Contrasts
> Levels plain-PLAIN plateau-PLATEAU PLATEAU-PLAIN
> plain 1 0 0
> plateau 0 1 0
> PLATEAU 0 -1 1
>> fit2MA.pi<-contrasts.fit(fitMA.pi,contrast.matrix)
>> fit2MA.pi<-eBayes(fit2MA.pi)
>> resultsi.001.fc2=decideTests(fit2MA.pi,method="nestedF",adjust.method="BH",p.value=0.001,lfc=log2(2))
> write.fit(fit2MA.pi,
> results=decideTests(fit2MA.pi,method="nestedF",adjust.method="BH",p.value=0.001,lfc=1),
> "fit2MA.pi.nestedF.adj_BH.P_001.FC_2.csv", digits=4, adjust="BH",
> sep=",")
>> save.image("result.RData")
>
> ++++++++++++++++++ Program Ends +++++++++++++++++++++
>
> End of Bioconductor Digest, Vol 53, Issue 31
> ********************************************
>
More information about the Bioconductor
mailing list