[R] Sparse matrix no longer sparse (Matrix Package)
Doran, Harold
HDoran at air.org
Wed Jul 10 20:38:36 CEST 2013
I have zero'd in on what appears to be the issue. This seems to be a bug in Matrix, but I am not sure yet. I am attaching files that would allow others to replicate this with my toy data.
Notice the elements of D1 in the attached data are all integers. It is a sparse, diagonal matrix.
> library(Matrix)
> class(D1)
[1] "ddiMatrix"
attr(,"package")
[1] "Matrix"
Now, I find the inverse of the matrix A as follows:
> A <- Ir + ZtZ %*% D1
> A.Inv <- solve(A, Ir)
Notice now the inverse of A remains a dgCMatrix and it is relatively small in size, only 33424 bytes.
> class(A.Inv)
[1] "dgCMatrix"
attr(,"package")
[1] "Matrix"
> object.size(A.Inv)
33424 bytes
Now, if I change an element of the matrix D1 to be non-integer, D1 still has the same class as it did before
> D1[1] <- 1.2
> class(D1)
[1] "ddiMatrix"
attr(,"package")
[1] "Matrix"
Now, if I use this new version of D1 in the same calculations as above, notice that A.Inv is no longer a dgCMatrix but instead becomes a dgeMatrix. It then increases from an object of size 33424 bytes to an object of size 2001112 bytes!
> A <- Ir + ZtZ %*% D1
> A.Inv <- solve(A, Ir)
> class(A.Inv)
[1] "dgeMatrix"
attr(,"package")
[1] "Matrix"
> object.size(A.Inv)
2001112 bytes
What I desire is that the object A.Inv remain sparse at all times and not become dense. But, perhaps there is a reason this change occurs that I don't fully understand.
I can of course coerce it back to a sparse matrix and it reduces back in size.
> object.size(as(A.Inv, 'sparseMatrix'))
33424 bytes
I of course recognize it requires more memory to store floating points than integers, but is this large increase on the order of magnitude that seems about right?
Is there a reason the floating point in D1 causes for A.Inv to no longer remain sparse?
Thank you for your help,
Harold
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Doran, Harold
Sent: Wednesday, July 10, 2013 11:42 AM
To: r-help at r-project.org
Cc: dmbates at gmail.com; 'maechler at stat.math.ethz.ch'
Subject: [R] Sparse matrix no longer sparse (Matrix Package)
I have a large function computing an iterative algorithm for fitting mixed linear models. Almost all code relies on functions from the Matrix package. I've come across an issue that I do not believe previously occurred in earlier versions of R or Matrix.
I have a large, sparse matrix, A as
> class(A);dim(A)
[1] "dgCMatrix"
attr(,"package")
[1] "Matrix"
[1] 12312 12312
I am in a position where I must find its inverse. I realize this is less than ideal, and I have two ways of doing this
A.Inv <- solve(A, Ir) or just solve(A)
Where Ir is an identity matrix with the same dimensions as A and it is also sparse
> class(Ir)
[1] "ddiMatrix"
attr(,"package")
[1] "Matrix"
The issue, however, is that the inverse of A is converted into a dense matrix and this becomes a huge memory hog, causing the rest of the algorithm to fail. In prior versions this remained as a sparse matrix.
> A.Inv[1:5, 1:5]
5 x 5 Matrix of class "dgeMatrix"
[,1] [,2] [,3] [,4] [,5]
[1,] 0.6878713 0.0000000 0.0000000 0.0000000 0.0000000 [2,] 0.0000000 0.6718767 0.0000000 0.0000000 0.0000000 [3,] 0.0000000 0.0000000 0.5076945 0.0000000 0.0000000 [4,] 0.0000000 0.0000000 0.0000000 0.2324122 0.0000000 [5,] 0.0000000 0.0000000 0.0000000 0.0000000 0.2139975
I could coerce this matrix to become sparse such as
> AA <- as(A.Inv, 'sparseMatrix')
> class(AA)
[1] "dgCMatrix"
attr(,"package")
[1] "Matrix"
> AA[1:5, 1:5]
5 x 5 sparse Matrix of class "dgCMatrix"
[1,] 0.6878713 . . . .
[2,] . 0.6718767 . . .
[3,] . . 0.5076945 . .
[4,] . . . 0.2324122 .
[5,] . . . . 0.2139975
But I don't think this is best.
So, my question is why is a matrix that is sparse turning into a dense matrix? Can I avoid that and keep it sparse without having to coerce it to be sparse after it is created?
Thank you very much
Harold
> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lme4_0.999999-2 Matrix_1.0-12 lattice_0.20-15
loaded via a namespace (and not attached):
[1] grid_3.0.1 nlme_3.1-109 stats4_3.0.1 tools_3.0.1
[[alternative HTML version deleted]]
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list