[BioC] question about the error from function intraspotCorrelation
Gordon Smyth
smyth at wehi.edu.au
Fri Aug 6 08:52:25 CEST 2004
Dear Ren,
Thanks for sending me your data. You haven't made a mistake. It turns out
that intraspotCorrelation() returns an error for one of your genes,
although it runs for all the others. intraspotCorrelation() depends on an
interative REML algorithm - when I get a chance I will look at the
algorithm and try to trap floating point errors so they don't cause the
function to fail in this way.
This is your targets setup:
targets <- readTargets()
> targets
FileName Cy3 Cy5
1436 1436.spot young26 old20
1437 1437.spot old20 young26
1438 1438.spot young27 old21
1439 1439.spot old21 young27
1440 1440.spot young28 old22
1441 1441.spot old22 young28
1442 1442.spot young29 old23
1443 1443.spot old23 young29
1444 1444.spot young30 old25
1445 1445.spot old25 young30
I understand that your reason for doing a single-channel analysis is so
that you can handle the technical replication by fitting an individual
effect for each mouse. Here are my results for your data:
targets.sc <- targetsA2C(targets)
# I include an effect for each mouse plus a dye effect
design.sc <-
model.matrix(~0+factor(targets.sc$Target)+factor(targets.sc$channel))
# intraspotCorrelation() fails for gene 86, so I fit to all other genes
corfit <- intraspotCorrelation(MA[-86,],design.sc)
#> corfit$consensus
#[1] 0.8627374 # intra-spot correlations are typically 0.8-0.95.
fit <- lmscFit(MA,design.sc,correlation=corfit$consensus)
# The contrast of interest is the average difference between old and young mice
cont.matrix <- cbind(OldvsYoung=c(1,1,1,1,1,-1,-1,-1,-1,-1,0)/5)
fit2 <- contrasts.fit(fit,cont.matrix)
fit2 <- eBayes(fit2)
tab <- topTable(fit2,adjust="fdr")
tab[,c("ID","Symbol","Status","M","A","t","P.Value","B")]
ID Symbol
Status M A t P.Value B
314 AK020608 9530053H05Rik oligo 0.7875154 9.655528 16.224403
2.055509e-07 12.971211
150 NM_024197 Ndufa10 oligo -0.5067656 10.951221 -16.197897
2.055509e-07 12.951719
462 BC009097 1110003E01Rik oligo -0.6551344 10.555844 -15.821665
2.055509e-07 12.671381
244 NM_008458 Serpina3c oligo -0.6652694 13.860417 -14.852744
3.034365e-07 11.916406
28 NM_008147 Gp49b oligo 0.5659843 8.584803 14.668451
3.034365e-07 11.767129
164 AK015780 4930513F16Rik oligo 0.5359480 8.959335 14.255934
3.544077e-07 11.426037
91 BC017687 Uqcrc1 oligo -0.5388312 10.798586 -13.204708
7.481306e-07 10.510645
113 NM_007977 F8 oligo 0.5478736 8.822306 11.668914
2.752735e-06 9.039561
337 NM_007763 Crip1 oligo 0.4535503 9.020447 11.008615
4.769757e-06 8.351686
195 NM_023065 Ifi30 oligo 0.6385894 7.613664 9.759321
1.669688e-05 6.945506
Actually you could also analyse this data using log-ratios and instead
handle the blocking structure using a random effects model. Here is how
that would run in limma:
design <- cbind(Dye=1,OldvsYoung=c(1,-1,1,-1,1,-1,1,-1,1,-1))
pair <- c(1,1,2,2,3,3,4,4,5,5)
corfit <- duplicateCorrelation(MA,design,block=pair)
#> corfit$consensus
#[1] -0.3769877 There is a reasonable within-block correlation. The
correlation is negative because of the dye-swapping.
fit <- lmFit(MA,design,block=pair,correlation=corfit$consensus)
fit <- eBayes(fit)
tab <- topTable(fit,coef="OldvsYoung",adjust="fdr")
tab[,c("ID","Symbol","Status","M","A","t","P.Value","B")]
ID Symbol
Status M A t P.Value B
314 AK020608 9530053H05Rik oligo 0.7829186 9.655528 10.280217
3.135044e-05 8.5213432
150 NM_024197 Ndufa10 oligo -0.5093503 10.951221 -6.963341
1.602865e-03 4.1907406
462 BC009097 1110003E01Rik oligo -0.6629056 10.555844 -6.323564
3.047635e-03 3.1488941
91 BC017687 Uqcrc1 oligo -0.5281134 10.798586 -6.111562
3.275146e-03 2.8236443
333 BC004749 Hagh oligo -0.4918728 10.774405 -5.102169
1.514349e-02 1.1027534
497 AK011827 2610108D09Rik oligo 0.3766103 11.110562 4.994212
1.514349e-02 0.8217170
164 AK015780 4930513F16Rik oligo 0.5131960 8.959335 4.872980
1.514349e-02 0.7170266
57 AK013971 Tex261 oligo -0.3951064 10.430905 -4.888604
1.514349e-02 0.6787478
244 NM_008458 Serpina3c oligo -0.6595177 13.860417 -4.691885
1.894933e-02 0.2713318
113 NM_007977 F8 oligo 0.5285670 8.822306 4.600728
2.028759e-02 0.1105870
Notice that the log-ratio analysis results are very similar to the
single-channel results but are somewhat less significant. The M-values from
the two analysis are almost the same - but not exactly identical because
you have quantitative spot weights set. My guess is that the significance
levels from the log-ratio analysis are more reliable than those from the
single-channel analysis, but I can't give you a firm reason.
Gordon
At 06:59 AM 3/08/2004, Ren Na wrote:
>Gordon Smyth wrote:
>
>>At 04:50 AM 31/07/2004, Ren Na wrote:
>>
>>>I tried to do individual channel analysis of two-colour microarray data.
>>>I sort of followed the example in function lmscFit. When I ran to the
>>>the following step, I got the error,
>>>corfit <- intraspotCorrelation(MA, design)
>>>Error in if (dev < devold - 1e-15) break :
>>> missing value where TRUE/FALSE needed
>>>
>>>I checked M and A, there is no missing value. Does anyone know what
>>>cause this error?
>>
>>
>>The error message is from the function remlscore() in the statmod
>>package, and indicates a failure of the iterative algorithm used to fit a
>>REML model. remlscore() is usually very reliable. One possible
>>explanation is that your data is degenerate in some way so that a model
>>can't be sensibly fit. If you provide a reproducible example of the
>>error, I will have a look at it.
>>
>>Gordon
>>
>>>Thank you very much.
>>
>Dr. Gordon Smyth,
>Thank you very much for replying my message. I really appreciate it.
>My mouse microarray data is five pairs of dye swaps,
>young26(Cy3) vs old20(Cy5); old20(Cy3)-young26(Cy5)
>young27(Cy3) vs old21(Cy5); old21(Cy3)-young27(Cy5)
>young28(Cy3) vs old22(Cy5); old22(Cy3)-young28(Cy5)
>young29(Cy3) vs old23(Cy5); old23(Cy3)-young29(Cy5)
>young30(Cy3) vs old25(Cy5); old25(Cy3)-young30(Cy5)
>I tried to create the topTable, find significantly expressed genes
>between young mice and old mice.
>Here are the steps I did:
> > library(limma)
> > targets <- readTargets("Targets.txt")
> > RG<-read.maimages(targets$FileName, source="spot",wt.fun=wtarea(100))
> > RG$genes<- readGAL("Mouse24052004.txt")
> > RG$printer<-getLayout(RG$genes)
> > spottypes<-readSpotTypes("spottypes.txt")
> > RG$genes$Status<- controlStatus(spottypes, RG$genes)
> > RG<-backgroundCorrect(RG,method="minimum")
> > MA<-normalizeWithinArrays(RG)
> > MA<-normalizeBetweenArrays(MA, method="Aq")
> > targets.sc <- targetsA2C(targets)
> > targets.sc$Target <- factor(targets.sc$Target, levels=c("young26",
> "old20", "young29", "old23", "young30", "old25", "young28", "old22",
> "young27", "old21"))
> > design <- model.matrix(~ -1+Target, data=targets.sc)
> > corfit <- intraspotCorrelation(MA, design)
>Error in if (dev < devold - 1e-15) break :
> missing value where TRUE/FALSE needed
>In addition: Warning message:
>reml: Max iterations exceeded in: remlscore(y, X, Z)
>For making the files "MA" and "design" small, I randomly choose 500 genes
>like the following:
>i <- sample(1: nrow(MA), 500)
>MA <- MA[i,]
>Then I use the function dput to save this "MA" object and the object
>"design" in the files "MA" and "design" , so you can use
>these files to run the function intraspotCorrelation. Please see the
>attachment for these two files.
>
>I am also not very clear about the contrast matrix. For my data, what
>is the right way to create the contrast matrix, and could you
>tell me where I can find some material about how to make
>design matrix and contrast matrix?
>
>Thanks again, I am looking forward to hearing from you.
>
>Sincerely,
>
>Ren
>
>
>[dput material deleted]
More information about the Bioconductor
mailing list