[R] how to invert the matrix with quite small eigenvalues

Thomas Lumley tlumley at u.washington.edu
Mon May 30 18:24:43 CEST 2005

On Mon, 30 May 2005, huang min wrote:>
> My intention is to invert the covariance matrix to perform some
> algorithm which is common in the estimating equations like GEE.

In that case there is no benefit in being able to invert very extreme 
covariance matrices. The asymptotic approximations to the distribution of 
regression coefficients will break down really badly with such extreme 
working covariance matrices.

I think in a case like this you should either
1) report an error and stop
2) shrink the covariance matrix towards a diagonal one, eg increase the 
diagonal entries until the condition number becomes reasonable.
3) Use a one-step estimator from the independence working model (which is 
asymptotically equivalent to the full solution and better behaved).

Remember that in GEE the matrix V^{-1} is just a set of weights, chosen to 
get good efficiency.   Your matrix solve(a) is not a good set of weights.

I think in an earlier thread on this topic Brian Ripley recommended using 
the singular value decomposition if you really have to compute something 
like D^TV^{-1}D.  In your example this still isn't good enough.

> Another strange point is that my friend use the LU decomposition in
> Fortran to solve the inverse matrix of aa for me. He used double
> precision of course, otherwise no inverse matrix in Fortran too. He
> show that the product of the two matrix is almost identity(The biggest
> off-digonal element is about 1e-8). I copy his inverse matrix(with 31
> digits!) to R and read in aa I sent to him(16 digits). The product is
> also not an identity matrix. It is fairly strange! Any suggestion?

It isn't that strange.  The system is computationally singular, meaning 
that you should expect to get different answers with apparently similar 
computations on different systems.

Also remember that what you care about for GEE is the result of 
solve(a,y-mu), rather than solve(a,a). Getting one of these right is no 
guarantee of getting the other one right.

If you really had to work with this matrix in double precision you would 
need to track very carefully the error bounds on all your computations, 
which would be very difficult.  Fortunately this is almost never necessary 
in statistics, and I don't think it's necessary in your case.

A good habit to get into when you aren't tracking error bounds carefully 
is to think of the last couple of digits of the result of any calculation 
as random.


More information about the R-help mailing list