[R] Sparse matrix no longer sparse (Matrix Package)

Doran, Harold HDoran at air.org
Fri Jul 12 15:24:28 CEST 2013


Here is code to completely replicate the issue with comments. I remain confused why simply changing one element of the ddi matrix to be non-integer changes two things: 1) It changes the class of the object I need (A Inverse) and it increases its memory. 

Ideally, A inverse will remain stored as a sparse matrix no matter what (as it is sparse in my real world problem). When it is converted to a dense object, it blows up in memory and my R program halts.

library(Matrix)

### Create a symmetric matrix of class dsCMatrix
A <- diag(5, 10)
A[1, 5] <- A[5,1] <- 2

### Create a diagonal matrix of class ddi
D <- Diagonal(10, 50)

### This returns the inverse of A stored as a sparse matrix 
### In my real world problem it consumes almost no memory at all
### this is the ideal type
A <- A %*%D
(aa <- solve(A))
class(aa)
object.size(aa)

### Now, let's only change one element of D to be non-integer
D[1] <- 1.5

### Notice here the inverse of the matrix A
### is now stored as a different object class than before
### even though the pattern of 0s and non-zeros remains the same
### It also increases in memory size
### In my real world problem, the matrix increases from 
### about .03 megabytes to almost 2 megabytes and this causes R to choke and die

A <- A %*% D
(aa <- solve(A))
class(aa)
object.size(aa)

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Jeff Newmiller
Sent: Thursday, July 11, 2013 3:22 PM
To: John Kane; r-help at r-project.org
Cc: dmbates at gmail.com; maechler at stat.math.ethz.ch
Subject: Re: [R] Sparse matrix no longer sparse (Matrix Package)

It seems to me that this issue should be reproducible with a small matrix, since the concern is the representation rather than the values.
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.

John Kane <jrkrideau at inbox.com> wrote:

>Just about anything I knew about matrices, I forgot years ago so I'm no 
>help here but I'd suggest putting the matrix on something like 
>Mediafire http://www.mediafire.com/ or Dropbox https://www.dropbox.com 
>so people can download it and have a look.
>
>I agree that dput() is not really good for "big" data sets.  
>
>Kingston ON Canada
>
>
>> -----Original Message-----
>> From: hdoran at air.org
>> Sent: Thu, 11 Jul 2013 17:10:54 +0000
>> To: hdoran at air.org, jrkrideau at inbox.com, r-help at r-project.org
>> Subject: RE: [R] Sparse matrix no longer sparse (Matrix Package)
>> 
>> This is a terrible example as I didn't realize my code actually does 
>> create a non-symmetric matrix and in this case the function behaves
>as
>> expected. Nonetheless, my original issue stands and that issue still
>does
>> not make sense.
>> 
>> Apologies for bad example code.
>> 
>> -----Original Message-----
>> From: r-help-bounces at r-project.org
>[mailto:r-help-bounces at r-project.org]
>> On Behalf Of Doran, Harold
>> Sent: Thursday, July 11, 2013 11:36 AM
>> To: 'John Kane'; r-help at r-project.org
>> Cc: dmbates at gmail.com; maechler at stat.math.ethz.ch
>> Subject: Re: [R] Sparse matrix no longer sparse (Matrix Package)
>> 
>> Thank you, John. I originally used dput() but the output is huge.
>> However, here is a reproducible example of what I think very
>unexpected
>> behavior of some matrix functions.
>> 
>>> ### Create a symmetric matrix of class dsCMatrix A <- as(diag(5,
>10),
>>> 'dsCMatrix') A[1, 5] <- A[5,1] <- 2
>>> 
>>> ### Create a diagonal matrix of class ddi D <- Diagonal(10, 1)
>>> 
>>> ### This works as it should
>>> aa <- Cholesky(A %*% D)
>>> 
>>> ### Now, let's only change one element of D to be non-integer D[1]
><-
>>> 1.5
>>> 
>>> ### AD is still symmetric, but here the Cholesky function complains 
>>> that it is not aa <- Cholesky(A %*% D)
>> Error in Cholesky(as(A, "symmetricMatrix"), perm = perm, LDL = LDL,
>super
>> = super,  :
>>   error in evaluating the argument 'A' in selecting a method for
>function
>> 'Cholesky': Error in asMethod(object) :
>>   not a symmetric matrix; consider forceSymmetric() or symmpart()
>> 
>> ### For fun try this
>> 
>>> L <- update(aa, as(A %*% D, 'symmetricMatrix'))
>> Error in asMethod(object) :
>>   not a symmetric matrix; consider forceSymmetric() or symmpart()
>> 
>> ### This does indeed work, but should I need to implement this step?
>> 
>> Cholesky(forceSymmetric(A %*% D))
>> 
>> So, there is something about changing the elements of a ddi matrix
>that
>> causes subsequent problems. Is there a good reason this occurs and 
>> something I should be doing differently, or is this a bug?
>> 
>> Thanks
>> 
>> -----Original Message-----
>> From: John Kane [mailto:jrkrideau at inbox.com]
>> Sent: Thursday, July 11, 2013 10:57 AM
>> To: Doran, Harold; r-help at r-project.org
>> Subject: RE: [R] Sparse matrix no longer sparse (Matrix Package)
>> 
>> The message got through but not the attachment. The R help list tends
>to
>> strip off attachements for security reasons.  Files of types  txt,
>png, &
>> pdf should get through.
>> 
>> In most cases the accepted method of sending data is to use the
>dput()
>> function to output a file in the console and then copy and paste the 
>> results into your email.
>> 
>> So for file "dat1" one would just use dput(dat1) and paste the
>results
>> into an email.
>> 
>> John Kane
>> Kingston ON Canada
>> 
>> 
>>> -----Original Message-----
>>> From: hdoran at air.org
>>> Sent: Thu, 11 Jul 2013 09:53:40 +0000
>>> To: r-help at r-project.org
>>> Subject: [R] Sparse matrix no longer sparse (Matrix Package)
>>> 
>>> I sent this message yesterday with an attachment allowing for 
>>> reproduction of the issue. But I think the attachment is preventing 
>>> the message from coming through. If anyone is interested I will 
>>> forward the attachment directly allowing for you to reproduce the 
>>> issue I observe.
>>> 
>>> On 7/10/13 2:38 PM, "Doran, Harold" <HDoran at air.org> wrote:
>>> 
>>> >I have zero'd in on what appears to be the issue. This seems to be
>a
>>> >bug in Matrix, but I am not sure yet. I am attaching files that
>would
>>> >allow others to replicate this with my toy data.
>>>> 
>>> >Notice the elements of D1 in the attached data are all integers. It 
>>> >is a sparse, diagonal matrix.
>>>> 
>>>>> library(Matrix)
>>>>> class(D1)
>>> >[1] "ddiMatrix"
>>> >attr(,"package")
>>> >[1] "Matrix"
>>>> 
>>> >Now, I find the inverse of the matrix A as follows:
>>>>> A <- Ir + ZtZ %*% D1
>>>>> A.Inv <- solve(A, Ir)
>>>> 
>>> >Notice now the inverse of A remains a dgCMatrix and it is
>relatively
>>> >small in size, only 33424 bytes.
>>>>> class(A.Inv)
>>> >[1] "dgCMatrix"
>>> >attr(,"package")
>>> >[1] "Matrix"
>>>> 
>>>>> object.size(A.Inv)
>>> >33424 bytes
>>>> 
>>> >Now, if I change an element of the matrix D1 to be non-integer, D1 
>>> >still has the same class as it did before
>>>> 
>>>>> D1[1] <- 1.2
>>>> 
>>>>> class(D1)
>>> >[1] "ddiMatrix"
>>> >attr(,"package")
>>> >[1] "Matrix"
>>>> 
>>> >Now, if I use this new version of D1 in the same calculations as 
>>> >above, notice that A.Inv is no longer a dgCMatrix but instead
>becomes
>>> >a dgeMatrix. It then increases from an object of size 33424 bytes
>to
>>> >an object of size 2001112 bytes!
>>>> 
>>>>> A <- Ir + ZtZ %*% D1
>>>>> A.Inv <- solve(A, Ir)
>>>>> class(A.Inv)
>>> >[1] "dgeMatrix"
>>> >attr(,"package")
>>> >[1] "Matrix"
>>>>> object.size(A.Inv)
>>> >2001112 bytes
>>>> 
>>> >What I desire is that the object A.Inv remain sparse at all times
>and
>>> not
>>> >become dense. But, perhaps there is a reason this change occurs
>that
>>> >I don't fully understand.
>>>> 
>>> >I can of course coerce it back to a sparse matrix and it reduces
>back
>>> >in size.
>>>>>  object.size(as(A.Inv, 'sparseMatrix'))
>>> >33424 bytes
>>>> 
>>> >I of course recognize it requires more memory to store floating 
>>> >points than integers, but is this large increase on the order of 
>>> >magnitude that seems about right?
>>>> 
>>> >Is there a reason the floating point in D1 causes for A.Inv to no 
>>> >longer remain sparse?
>>>> 
>>> >Thank you for your help,
>>> >Harold
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>> >-----Original Message-----
>>> >From: r-help-bounces at r-project.org
>>> >[mailto:r-help-bounces at r-project.org]
>>> >On Behalf Of Doran, Harold
>>> >Sent: Wednesday, July 10, 2013 11:42 AM
>>> >To: r-help at r-project.org
>>> >Cc: dmbates at gmail.com; 'maechler at stat.math.ethz.ch'
>>> >Subject: [R] Sparse matrix no longer sparse (Matrix Package)
>>>> 
>>> >I have a large function computing an iterative algorithm for
>fitting
>>> >mixed linear models. Almost all code relies on functions from the 
>>> >Matrix package. I've come across an issue that I do not believe 
>>> >previously occurred in earlier versions of R or Matrix.
>>>> 
>>> >I have a large, sparse matrix, A as
>>>> 
>>>>> class(A);dim(A)
>>> >[1] "dgCMatrix"
>>> >attr(,"package")
>>> >[1] "Matrix"
>>> >[1] 12312 12312
>>>> 
>>> >I am in a position where I must find its inverse.  I realize this
>is
>>> less
>>> >than ideal, and I have two ways of doing this
>>>> 
>>> >A.Inv <- solve(A, Ir) or just solve(A)
>>>> 
>>> >Where Ir is an identity matrix with the same dimensions as A and it 
>>> >is also sparse
>>>> 
>>>>> class(Ir)
>>> >[1] "ddiMatrix"
>>> >attr(,"package")
>>> >[1] "Matrix"
>>>> 
>>> >The issue, however, is that the inverse of A is converted into a 
>>> >dense matrix and this becomes a huge memory hog, causing the rest
>of
>>> >the algorithm to fail. In prior versions this remained as a sparse
>>> matrix.
>>>> 
>>>>> A.Inv[1:5, 1:5]
>>> >5 x 5 Matrix of class "dgeMatrix"
>>>>          [,1]      [,2]      [,3]      [,4]      [,5]
>>> >[1,] 0.6878713 0.0000000 0.0000000 0.0000000 0.0000000 [2,]
>0.0000000
>>> >0.6718767 0.0000000 0.0000000 0.0000000 [3,] 0.0000000 0.0000000
>>> >0.5076945 0.0000000 0.0000000 [4,] 0.0000000 0.0000000 0.0000000
>>> >0.2324122 0.0000000 [5,] 0.0000000 0.0000000 0.0000000 0.0000000
>>> 0.2139975
>>>> 
>>> >I could coerce this matrix to become sparse such as
>>>> 
>>>>> AA <- as(A.Inv, 'sparseMatrix')
>>>>> class(AA)
>>> >[1] "dgCMatrix"
>>> >attr(,"package")
>>> >[1] "Matrix"
>>>> 
>>>>> AA[1:5, 1:5]
>>> >5 x 5 sparse Matrix of class "dgCMatrix"
>>>> 
>>> >[1,] 0.6878713 .         .         .         .
>>> >[2,] .         0.6718767 .         .         .
>>> >[3,] .         .         0.5076945 .         .
>>> >[4,] .         .         .         0.2324122 .
>>> >[5,] .         .         .         .         0.2139975
>>>> 
>>> >But I don't think this is best.
>>>> 
>>> >So, my question is why is a matrix that is sparse turning into a 
>>> >dense matrix? Can I avoid that and keep it sparse without having to 
>>> >coerce it to be sparse after it is created?
>>>> 
>>> >Thank you very much
>>> >Harold
>>>> 
>>>> 
>>>>> sessionInfo()
>>> >R version 3.0.1 (2013-05-16)
>>> >Platform: x86_64-w64-mingw32/x64 (64-bit)
>>>> 
>>> >locale:
>>> >[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
>>> >States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C 
>>> >[5] LC_TIME=English_United States.1252
>>>> 
>>> >attached base packages:
>>> >[1] stats     graphics  grDevices utils     datasets  methods  
>base
>>>> 
>>> >other attached packages:
>>> >[1] lme4_0.999999-2 Matrix_1.0-12   lattice_0.20-15
>>>> 
>>> >loaded via a namespace (and not attached):
>>> >[1] grid_3.0.1   nlme_3.1-109 stats4_3.0.1 tools_3.0.1
>>>> 
>>> >	[[alternative HTML version deleted]]
>>>> 
>>> >______________________________________________
>>> >R-help at r-project.org mailing list
>>> >https://stat.ethz.ch/mailman/listinfo/r-help
>>> >PLEASE do read the posting guide
>>> >http://www.R-project.org/posting-guide.html
>>> >and provide commented, minimal, self-contained, reproducible code.
>>> 
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>____________________________________________________________
>GET FREE SMILEYS FOR YOUR IM & EMAIL - Learn more at 
>http://www.inbox.com/smileys Works with AIM®, MSN® Messenger, Yahoo!® 
>Messenger, ICQ®, Google Talk™ and most webmails
>
>______________________________________________
>R-help at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


More information about the R-help mailing list