[BioC] Limma: Error with dupcor.series/duplicateCorrelatoin function

Valtteri Wirta valtteri at biotech.kth.se
Tue Mar 16 22:50:33 MET 2004


Hello Limma developers and users,

(sorry for the long posting)

Today I encountered a similar problem as the one that was reported by Jason 
Skelton.

I'm using Limma version 1.5.7 and R 1.8.1 on Windows XP platform.

I have 16 slides divided into four comparisons (four in each).
Each array contains 30 000 elements, representing 15 000 probes printed in 
duplicate (two identical fields => spacing=15000).
The M-values are stored in a matrix called Ms
Data for unreliable features is removed through filtration which causes 
some probes to have no valid measurements.

The modelMatrix (previously known as designMatrix) looks like this:
 > design
    HF HM LF LM
1   1  0  0  0
2   0  0  0  1
3   0  0  1  0
4   0  1  0  0
5   0  1  0  0
6   1  0  0  0
7   0  0  0  1
8   0  0  1  0
9   0  1  0  0
10  1  0  0  0
11  0  0  0  1
12  0 -1  0  0
13 -1  0  0  0
14  0  0  0 -1
15  0  0  1  0
16  0  0 -1  0

When estimating the within-slide correlation using dupcor.series (or 
duplicateCorrelation)
 > cor <- dupcor.series(Ms, design=design, ndups=2, spacing=15000)

the following error message is displayed:
Error in if (rho[i] < -1) rho[i] <- -1 : missing value where TRUE/FALSE needed

Ok. I replaced all NAs with 1 and now it works nicely.
It seems therefore that this has something to do with the NAs.

As the purpose of this step is to estimate the within-slide duplicate spot 
correlation, I thought that changing the design object into a new one 
considering all slides belonging to the same comparison should be ok.

 > cor <- dupcor.series(Ms, design=rep(1,16), ndups=2, spacing=15000)

And this works nicely. (Only the warning message: Max iterations exceeded 
in: glmgam.fit(dx, dy, start = c(mean(dy), 0)) is displayed, but that 
should not matter).
Do you think it is ok to do like this?


Now, back to the similarity with Jason's example:
I tried to further investigate the problem. Now I'm back to the old design 
with four comparisons (shown above) and for each comparison I change the 
M-value for all genes with no valid measurement on any of the four slides to 1.

 > cor <- dupcor.series(Ms.new, design=design, ndups=2, spacing=15000)

This works nicely.

This brings up the question of "how many valid measurements are needed?"
For genes with no valid measurement on any of the four slides I replaced 1 
of the measurements (always the first one) with 1.
Then

 > cor <- dupcor.series(Ms.new2, design=design, ndups=2, spacing=15000)

gives the following error:
  Error in if (rho[i] < -1) rho[i] <- -1 : missing value where TRUE/FALSE 
needed

When I change two of the four NAs to 1, this error is displayed:
Error in chol(XVX + lambda * I) : the leading minor of order 2 is not 
positive definite

Same goes for three of four, and four of four into 1 works nicely, as expected.

To make this even more difficult to understand... If I break out the four 
slides corresponding to the first comparison (HF in the design matrix 
above) the dupcor.series function works nicely also on the matrix containg 
the original NA values:

Ms.new3 <- Ms[,c(1,6,10,13)]
cor.3 <- dupcor.series(Ms.new3, design=c(1,1,1,-1), ndups=2, spacing=15000)


To conclude, it seems like there is a problem when a modelMatrix 
(designMatrix) is used in the dupcor.series/dupliceCorrelation function and 
that this is related to the number of valid measurements. I'd appreciate if 
someone could comment on this and perhaps explain what is wrong or what I 
am doing wrong

Finally, I have a small suggestions. Would it be possibly to have LIMMA 
display the version number when the package is loaded?

regards,
Valtteri




Contact information:

Valtteri Wirta
Department of Biotechnology, KTH
AlbaNova University Center
S - 10691 Stockholm, Sweden

Visiting address: Roslagstullsbacken 21, B3
Phone: +46 8 5537 8344(office)
Phone: +46 733 386 341 (gsm)
Fax: +46 8 5537 8481
Email: valtteri at biotech.kth.se
Web: http://biobase.biotech.kth.se/~valtteri/R



More information about the Bioconductor mailing list