[R-sig-ME] Representation of CHMfactor objects in Matrix package

Sat Jun 5 14:36:08 CEST 2010

On Fri, Jun 4, 2010 at 4:16 PM, Ryan King <c.ryan.king at gmail.com> wrote:
> Hi,
> I'm working a a mixed model problem which is slightly nonstandard, so
> I have to do a bit of optimization by hand.
>
> The general problem is that I have to repeatedly cholesky factor matrices like
>
> Identity + C L t(L) C
>
> where L is the cholesky root of a matrix which does not change each
> iteration, and C is a diagonal matrix which does depend on parameters.
>  C is not uniform, so I can not factor it out for a constant multiple
> of identity.
>
> I would like to use the fast update() method for CHMfactor objects
> produced by Cholesky() to take care of the Identity part, and the
> solve() method for the next stage after I decompose this matrix.
> Since C is diagonal, it should be very easy to scale the rows of L and
> still be a cholesky root.  The problem that I've run into is that I am
> not sure about the internal representation of CHMfactor objects.
>
> I understand the @x, @i, @p slots which code for some sparse matrix,
> but I don't know which matrix that is.
>
> How the @perms translate into the P matrix is also not clear to me.
> Many of the remaining slots do not appear in the manual.
>
> Sorry if this is just me being clumsy at S4.  Right now I just chol()
> the above matrix at each step, which is unnecessarily slow.

The representation of the Cholesky factor is based on the C struct
defined in the CHOLMOD package of C functions, which are included in
the sources for the Matrix package.  You could go through the
definition (in Matrix/src/CHOLMOD/Include/cholmod_core.h) and decide
how to update the structure but I think there is a simpler approach,
which is to update the sparse, positive definite matrix being factored
then update the factor.

It may seem that this is equivalent to using chol each time as you are
doing now but if you use the update method for a CHMfactor it bypasses
much of the effort in chol().  Most sparse matrix methods are
performed in two phase: a symbolic phase in which the number of
non-zeros in the result and their locations are calculated and a
numeric phase in which the actual values at these locations are
determined.  The update method only needs to perform the numeric
phase. Also, the update method allows you to add a multiple of the
identity matrix virtually during the decomposition process without
actually adding it to the matrix being decomposed.

If you are willing to use compiled code it is even faster because
there is a special scale operator that scales a symmetric matrix
symmetrically, i.e. your C %*% L %*% t(L) %*% C operation and a
special method for updating the decomposition with the virtual
multiple of the identity.