[R] backsolve, chol, Matrix, and SparseM

Fri Sep 25 10:25:02 CEST 2015

Dear Ben,

>>>>> Benjamin Tyner <btyner at gmail.com>
>>>>>     on Thu, 24 Sep 2015 13:47:58 -0400 writes:

    > Hi I have some code which does (on a symmetric matrix 'x')

    >     backsolve(chol(x), diag(nrow(x)))

    > and I am wondering what is the recommended way to
    > accomplish this when x is also sparse (from
    > package:Matrix). I know that package:Matrix provides a
    > chol method for such matrices, but not a backsolve
    > method. On the other hand, package:SparseM does provide a
    > backsolve method, but doesn't actually return a sparse
    > matrix. Moreover, I am a little hesitant to use SparseM,
    > as the vignette seems to be from 2003.

Roger Koenker has agreed in the past, that new projects should
rather use Matrix.   SparseM has been the very first R package
providing sparse matrix support.

    > I did notice that help(topic = "solve", package =
    > "Matrix") says "In ‘solve(a,b)’ in the ‘Matrix’ package,
    > ‘a’ may also be a ‘MatrixFactorization’ instead of
    > directly a matrix." which makes me think this is the right
    > way:

    >     Matrix::solve(Cholesky(x), .sparseDiagonal(nrow(x)))

    > but unfortunately this didn't give the same result as:

    >     Matrix::solve(chol(x), .sparseDiagonal(nrow(x)))

    > so I'm asking here in case someone has any suggestions.

You don't give any examples.
So a few remarks and a reproducible example to get more concrete

A. As the Matrix package has classes for triangular matrices and
  Matrix :: chol() returns them, there   is no need for
  forwardsolve() or backwardsolve(), as just   solve() is always
  enough.

B. As Doug Bates has been teaching for many decennia, "it is
  almost always computationally *wrong* to compute a matrix
  inverse explicitly".
  Rather compute    A^{-1} B   or  A^{-1} x  {for vector x,
  matrix B (but different from Identity).

C. Inspite of B, there are cases (such as computing sandwich
  estimates of covariance matrices) where you do want the inverse.
  In that case,

   solve(A)   		  is semantically equivalent to
   solve(A, diag(.))

   and almost always the *first* form is implempented more
   efficiently than the second.

D. In Matrix,  use chol(.) ... unless you really read a bit
   about Cholesky(.) and its special purpose sparse cholesky decompositions.
   As mentioned above,  Matrix :: chol()  will return a
   "formally triangular" matrix, i.e., inheriting from
   "triangularMatrix"; in the sparse case, very typically of
   specific class "dtCMatrix".

Here's a small reproducible example,
please use it to ask further questions:

*.R:

library(Matrix)
M <- as(diag(4)+1,"dsCMatrix")
m <- as(M, "matrix") # -> traditional R matrix
stopifnot( all(M == m) )
M
L <- Cholesky(M,super=TRUE,perm=FALSE) # a MatrixFactorization ("dCHMsuper")
(L. <- as(L, "Matrix")) #-> lower-triagonal (sparseMatrix, specifically "dtCMatrix")
(cM <- chol(M))# *upper* triagonal ("dtCMatrix")
(cm <- chol(m))#  upper  triagonal traditional matrix -- the same "of course" :
all.equal(as.matrix(cM), cm) # TRUE

(r. <- backsolve(cm, diag(4)))# upper tri. (traditional) matrix
(R. <-     solve(cM) ) ## the "same"  (but nicer printing)
all.equal(as.matrix(R.), r., check.attributes=FALSE) # TRUE
all( abs(R. - r.) <  1e-12 * mean(abs(R.))) # TRUE

*.Rout:

> M <- as(diag(4)+1,"dsCMatrix")
> m <- as(M, "matrix") # -> traditional R matrix
> stopifnot( all(M == m) )
> M
4 x 4 sparse Matrix of class "dsCMatrix"

[1,] 2 1 1 1
[2,] 1 2 1 1
[3,] 1 1 2 1
[4,] 1 1 1 2
> L <- Cholesky(M,super=TRUE,perm=FALSE) # a MatrixFactorization ("dCHMsuper")
> (L. <- as(L, "Matrix")) #-> lower-triagonal (sparseMatrix, specifically "dtCMatrix")
4 x 4 sparse Matrix of class "dtCMatrix"

[1,] 1.4142136 .         .         .       
[2,] 0.7071068 1.2247449 .         .       
[3,] 0.7071068 0.4082483 1.1547005 .       
[4,] 0.7071068 0.4082483 0.2886751 1.118034
> (cM <- chol(M))# *upper* triagonal ("dtCMatrix")
4 x 4 sparse Matrix of class "dtCMatrix"

[1,] 1.414214 0.7071068 0.7071068 0.7071068
[2,] .        1.2247449 0.4082483 0.4082483
[3,] .        .         1.1547005 0.2886751
[4,] .        .         .         1.1180340
> (cm <- chol(m))#  upper  triagonal traditional matrix -- the same "of course" :
         [,1]      [,2]      [,3]      [,4]
[1,] 1.414214 0.7071068 0.7071068 0.7071068
[2,] 0.000000 1.2247449 0.4082483 0.4082483
[3,] 0.000000 0.0000000 1.1547005 0.2886751
[4,] 0.000000 0.0000000 0.0000000 1.1180340
> all.equal(as.matrix(cM), cm) # TRUE
[1] TRUE
> (r. <- backsolve(cm, diag(4)))# upper tri. (traditional) matrix
          [,1]       [,2]       [,3]       [,4]
[1,] 0.7071068 -0.4082483 -0.2886751 -0.2236068
[2,] 0.0000000  0.8164966 -0.2886751 -0.2236068
[3,] 0.0000000  0.0000000  0.8660254 -0.2236068
[4,] 0.0000000  0.0000000  0.0000000  0.8944272
> (R. <-     solve(cM) ) ## the "same"  (but nicer printing)
4 x 4 sparse Matrix of class "dtCMatrix"

[1,] 0.7071068 -0.4082483 -0.2886751 -0.2236068
[2,] .          0.8164966 -0.2886751 -0.2236068
[3,] .          .          0.8660254 -0.2236068
[4,] .          .          .          0.8944272
> all.equal(as.matrix(R.), r., check.attributes=FALSE) # TRUE
[1] TRUE
> all( abs(R. - r.) <  1e-12 * mean(abs(R.))) # TRUE
[1] TRUE
>