[R] backsolve, chol, Matrix, and SparseM

Sat Sep 26 01:36:08 CEST 2015

Hi Martin,

Thanks for the remarks and examples, and for confirming that I was
indeed barking up the wrong tree with SparseM.

A. I assume that is a typo and you meant to say, no need for backsolve().
B. Absolutely; however, in this case I am taking advantage of
quadprog::solve.QP(..., factorized = TRUE) which requires the inverse of
the Cholesky factor; it turns out to be faster to compute this one time
upfront rather than have solve.QP(..., factorized = FALSE) do it over
and over again. Of course the holy grail would be a QP solver which
takes advantage of the various innovations from package:Matrix, but I
digress...
C. Agreed, assuming you are talking about Matrix::solve(X) on X of class
Matrix. On the other hand for a regular matrix x it is not difficult to
construct examples where backsolve(chol(x), diag(nrow(x))) is twice as
fast as base::solve(chol(x)), which led me down this path in the first
place.

By the way, is R-forge still the correct place to report bugs in
package:Matrix?

Regards
Ben

On 09/25/2015 04:25 AM, Martin Maechler wrote:
> Dear Ben,
>
>>>>>> Benjamin Tyner <btyner at gmail.com>
>>>>>>     on Thu, 24 Sep 2015 13:47:58 -0400 writes:
>     > Hi I have some code which does (on a symmetric matrix 'x')
>
>     >     backsolve(chol(x), diag(nrow(x)))
>
>     > and I am wondering what is the recommended way to
>     > accomplish this when x is also sparse (from
>     > package:Matrix). I know that package:Matrix provides a
>     > chol method for such matrices, but not a backsolve
>     > method. On the other hand, package:SparseM does provide a
>     > backsolve method, but doesn't actually return a sparse
>     > matrix. Moreover, I am a little hesitant to use SparseM,
>     > as the vignette seems to be from 2003.
>
> Roger Koenker has agreed in the past, that new projects should
> rather use Matrix.   SparseM has been the very first R package
> providing sparse matrix support.
>
>
>     > I did notice that help(topic = "solve", package =
>     > "Matrix") says "In ‘solve(a,b)’ in the ‘Matrix’ package,
>     > ‘a’ may also be a ‘MatrixFactorization’ instead of
>     > directly a matrix." which makes me think this is the right
>     > way:
>
>     >     Matrix::solve(Cholesky(x), .sparseDiagonal(nrow(x)))
>
>     > but unfortunately this didn't give the same result as:
>
>     >     Matrix::solve(chol(x), .sparseDiagonal(nrow(x)))
>
>     > so I'm asking here in case someone has any suggestions.
>
> You don't give any examples.
> So a few remarks and a reproducible example to get more concrete
>
> A. As the Matrix package has classes for triangular matrices and
>   Matrix :: chol() returns them, there   is no need for
>   forwardsolve() or backwardsolve(), as just   solve() is always
>   enough.
>
> B. As Doug Bates has been teaching for many decennia, "it is
>   almost always computationally *wrong* to compute a matrix
>   inverse explicitly".
>   Rather compute    A^{-1} B   or  A^{-1} x  {for vector x,
>   matrix B (but different from Identity).
>
> C. Inspite of B, there are cases (such as computing sandwich
>   estimates of covariance matrices) where you do want the inverse.
>   In that case,
>
>    solve(A)   		  is semantically equivalent to
>    solve(A, diag(.))
>
>    and almost always the *first* form is implempented more
>    efficiently than the second.
>
> D. In Matrix,  use chol(.) ... unless you really read a bit
>    about Cholesky(.) and its special purpose sparse cholesky decompositions.
>    As mentioned above,  Matrix :: chol()  will return a
>    "formally triangular" matrix, i.e., inheriting from
>    "triangularMatrix"; in the sparse case, very typically of
>    specific class "dtCMatrix".
>
> Here's a small reproducible example,
> please use it to ask further questions:
>
> *.R:
>
> library(Matrix)
> M <- as(diag(4)+1,"dsCMatrix")
> m <- as(M, "matrix") # -> traditional R matrix
> stopifnot( all(M == m) )
> M
> L <- Cholesky(M,super=TRUE,perm=FALSE) # a MatrixFactorization ("dCHMsuper")
> (L. <- as(L, "Matrix")) #-> lower-triagonal (sparseMatrix, specifically "dtCMatrix")
> (cM <- chol(M))# *upper* triagonal ("dtCMatrix")
> (cm <- chol(m))#  upper  triagonal traditional matrix -- the same "of course" :
> all.equal(as.matrix(cM), cm) # TRUE
>
> (r. <- backsolve(cm, diag(4)))# upper tri. (traditional) matrix
> (R. <-     solve(cM) ) ## the "same"  (but nicer printing)
> all.equal(as.matrix(R.), r., check.attributes=FALSE) # TRUE
> all( abs(R. - r.) <  1e-12 * mean(abs(R.))) # TRUE
>
> *.Rout:
>
>> M <- as(diag(4)+1,"dsCMatrix")
>> m <- as(M, "matrix") # -> traditional R matrix
>> stopifnot( all(M == m) )
>> M
> 4 x 4 sparse Matrix of class "dsCMatrix"
>             
> [1,] 2 1 1 1
> [2,] 1 2 1 1
> [3,] 1 1 2 1
> [4,] 1 1 1 2
>> L <- Cholesky(M,super=TRUE,perm=FALSE) # a MatrixFactorization ("dCHMsuper")
>> (L. <- as(L, "Matrix")) #-> lower-triagonal (sparseMatrix, specifically "dtCMatrix")
> 4 x 4 sparse Matrix of class "dtCMatrix"
>                                            
> [1,] 1.4142136 .         .         .       
> [2,] 0.7071068 1.2247449 .         .       
> [3,] 0.7071068 0.4082483 1.1547005 .       
> [4,] 0.7071068 0.4082483 0.2886751 1.118034
>> (cM <- chol(M))# *upper* triagonal ("dtCMatrix")
> 4 x 4 sparse Matrix of class "dtCMatrix"
>                                            
> [1,] 1.414214 0.7071068 0.7071068 0.7071068
> [2,] .        1.2247449 0.4082483 0.4082483
> [3,] .        .         1.1547005 0.2886751
> [4,] .        .         .         1.1180340
>> (cm <- chol(m))#  upper  triagonal traditional matrix -- the same "of course" :
>          [,1]      [,2]      [,3]      [,4]
> [1,] 1.414214 0.7071068 0.7071068 0.7071068
> [2,] 0.000000 1.2247449 0.4082483 0.4082483
> [3,] 0.000000 0.0000000 1.1547005 0.2886751
> [4,] 0.000000 0.0000000 0.0000000 1.1180340
>> all.equal(as.matrix(cM), cm) # TRUE
> [1] TRUE
>> (r. <- backsolve(cm, diag(4)))# upper tri. (traditional) matrix
>           [,1]       [,2]       [,3]       [,4]
> [1,] 0.7071068 -0.4082483 -0.2886751 -0.2236068
> [2,] 0.0000000  0.8164966 -0.2886751 -0.2236068
> [3,] 0.0000000  0.0000000  0.8660254 -0.2236068
> [4,] 0.0000000  0.0000000  0.0000000  0.8944272
>> (R. <-     solve(cM) ) ## the "same"  (but nicer printing)
> 4 x 4 sparse Matrix of class "dtCMatrix"
>                                                
> [1,] 0.7071068 -0.4082483 -0.2886751 -0.2236068
> [2,] .          0.8164966 -0.2886751 -0.2236068
> [3,] .          .          0.8660254 -0.2236068
> [4,] .          .          .          0.8944272
>> all.equal(as.matrix(R.), r., check.attributes=FALSE) # TRUE
> [1] TRUE
>> all( abs(R. - r.) <  1e-12 * mean(abs(R.))) # TRUE
> [1] TRUE