[R] Large determinant problem

maj at stats.waikato.ac.nz maj at stats.waikato.ac.nz
Sun Dec 9 06:28:53 CET 2007


I thought I would have another try at explaining my problem. I think that
last time I may have buried it in irrelevant detail.

This output should explain my dilemma:

> dim(S)
[1] 1455  269
> summary(as.vector(S))
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-1.160e+04  0.000e+00  0.000e+00 -4.132e-08  0.000e+00  8.636e+03
> sum(as.vector(S)==0)/(1455*269)
[1] 0.8451794
# S is a large moderately sparse matrix with some large elements
> SS <- crossprod(S,S)
> (eigen(SS,only.values = TRUE)$values)[250:269]
 [1]  9.264883e+04  5.819672e+04  5.695073e+04  1.948626e+04  1.500891e+04
 [6]  1.177034e+04  9.696327e+03  8.037049e+03  7.134058e+03  1.316449e-07
[11]  9.077244e-08  6.417276e-08  5.046411e-08  1.998775e-08 -1.268081e-09
[16] -3.140881e-08 -4.478184e-08 -5.370730e-08 -8.507492e-08 -9.496699e-08
# S'S fails to be non-negative definite.

I can't show you how to produce S easily but below I attempt at a
reproducible version of the problem:

> set.seed(091207)
> X <- runif(1455*269,-1e4,1e4)
> p <- rbinom(1455*269,1,0.845)
> Y <- p*X
> dim(Y) <- c(1455,269)
> YY <- crossprod(Y,Y)
> (eigen(YY,only.values = TRUE)$values)[250:269]
 [1] 17951634238 17928076223 17725528630 17647734206 17218470634 16947982383
 [7] 16728385887 16569501198 16498812174 16211312750 16127786747 16006841514
[13] 15641955527 15472400630 15433931889 15083894866 14794357643 14586969350
[19] 14297854542 13986819627
# No sign of negative eigenvalues; phenomenon must be due
# to special structure of S.
# S is a matrix of empirical parameter scores at an approximate
# mle for a model with 269 paramters fitted to 1455 observations.
# Thus, for example, its column sums are approximately zero:
> summary(apply(S,2,sum))
      Min.    1st Qu.     Median       Mean    3rd Qu.       Max.
-1.148e-03 -2.227e-04 -7.496e-06 -6.011e-05  7.967e-05  8.254e-04

I'm starting to think that it may not be a good idea to attempt to compute
large information matrices and their determinants!

Murray Jorgensen



More information about the R-help mailing list