[R] working with sparse matrix
Martin Maechler
maechler at stat.math.ethz.ch
Wed Jun 22 10:34:18 CEST 2011
>>>>> David Winsemius <dwinsemius at comcast.net>
>>>>> on Tue, 21 Jun 2011 12:07:47 -0400 writes:
> On Jun 21, 2011, at 11:44 AM, Patrick Breheny wrote:
>> On 06/21/2011 09:48 AM, davefrederick wrote:
>>> Hi, I have a 500 x 53380 sparse matrix and I am trying to
>>> dichotomize it. In the sna package I have found event2dichot,
>>> yet it doesn't recognize sparse matrices and requires an adjacency
>>> matrix or array. I tried to run a simple loop:
>>>
>>> for (i in 1:500) for (j in 1:53380) if (matrix[i,j]>0)
>>> matrix[i,j]=1
>>
>> The code you are running does not require a loop:
>>
>> > A <- cbind(c(1,0),c(0,2))
>> > A
>>      [,1] [,2]
>> [1,]    1    0
>> [2,]    0    2
>> > A[A>0] <- 1
>> > A
>>      [,1] [,2]
>> [1,]    1    0
>> [2,]    0    1
>>
>> However, for large sparse matrices, this and other operations
>> will be faster if the matrix is explicitly stored as a sparse
>> matrix, as implemented in the 'Matrix' package.
> require(Matrix)
> M <- Matrix(0, 10,10) # empty sparse matrix
> M[1:10, 1] <- 1
> M[1:10, 2] <- 2
> M[1:10, 3] <- -3
> M[M > 0] <- 1
> > M
> 10 x 10 sparse Matrix of class "dgCMatrix"
> [1,] 1 1 -3 . . . . . . .
> [2,] 1 1 -3 . . . . . . .
> [3,] 1 1 -3 . . . . . . .
> [4,] 1 1 -3 . . . . . . .
> [5,] 1 1 -3 . . . . . . .
> [6,] 1 1 -3 . . . . . . .
> [7,] 1 1 -3 . . . . . . .
> [8,] 1 1 -3 . . . . . . .
> [9,] 1 1 -3 . . . . . . .
> [10,] 1 1 -3 . . . . . . .
> > M2 <- as.matrix(M)
> > M2
> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> [1,] 1 1 -3 0 0 0 0 0 0 0
> [2,] 1 1 -3 0 0 0 0 0 0 0
> [3,] 1 1 -3 0 0 0 0 0 0 0
> [4,] 1 1 -3 0 0 0 0 0 0 0
> [5,] 1 1 -3 0 0 0 0 0 0 0
> [6,] 1 1 -3 0 0 0 0 0 0 0
> [7,] 1 1 -3 0 0 0 0 0 0 0
> [8,] 1 1 -3 0 0 0 0 0 0 0
> [9,] 1 1 -3 0 0 0 0 0 0 0
> [10,] 1 1 -3 0 0 0 0 0 0 0
> There might have been a one or two second pause while the last
> command was executing for this test:
> > M <- Matrix(0, 500, 53380)
> > M[1:10, 1] <- 1
> > M[1:10, 2] <- 2
> > M[1:10, 3] <- -3
> > M[M > 0] <- 1
Yes, that's much better, thank you, David.
Just a short note: The above technique of
M <- Matrix(0, n, m)
M[i,j] <- v1
M[k,l] <- v2
...
may be natural to produce a sparse matrix, and ok for such very
small examples, but it is typically *MUCH MUCH* less efficient than
directly using the
sparseMatrix()
or spMatrix()
functions which the Matrix package provides.
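
As a minimal sketch of the direct route (not code from this thread):
the same 30 entries as in David's test can be handed to sparseMatrix()
in one call, as row indices, column indices and values. The vector
names i, j and x are chosen here purely for illustration. Overwriting
the positive stored values in the x slot then dichotomizes them
without ever touching the zeros; that slot-level shortcut assumes a
"dgCMatrix", which is what sparseMatrix() returns by default.

```r
library(Matrix)

# Triplet description of the nonzero entries:
# rows 1..10 of columns 1, 2 and 3 hold the values 1, 2 and -3
i <- rep(1:10, times = 3)            # row index of each entry
j <- rep(1:3, each = 10)             # column index of each entry
x <- rep(c(1, 2, -3), each = 10)     # value of each entry

# Build the whole matrix in one call instead of repeated M[i, j] <- v
M <- sparseMatrix(i = i, j = j, x = x, dims = c(500, 53380))

# Dichotomize: set every positive stored value to 1.
# Only the stored (nonzero) values are visited; negatives stay as they are,
# exactly as with M[M > 0] <- 1.
M@x[M@x > 0] <- 1
```

Printing M[1:3, 1:3] afterwards is a quick way to check the result
against the small 10 x 10 example above.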
> > M2 <- as.matrix(M)
Why would you have to make your sparse matrix dense?
(It won't work for much larger matrices anyway: they can't exist
in your RAM as dense ones.)
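
To put rough numbers on this, here is a small sketch, assuming the
500 x 53380 test matrix with 30 nonzero entries from above. A dense
double matrix costs 8 bytes per cell, zeros included, while the sparse
object only stores the nonzeros plus its index vectors. The names
dense_bytes and sparse_bytes are illustrative only. Simple summaries
such as nnzero() and colSums() work on the sparse object directly, so
as.matrix() is not needed for them.

```r
library(Matrix)

# Same test matrix as above, built directly in sparse form
M <- sparseMatrix(i = rep(1:10, times = 3),
                  j = rep(1:3, each = 10),
                  x = rep(c(1, 2, -3), each = 10),
                  dims = c(500, 53380))

# Dense storage: 8 bytes for every one of the 500 * 53380 cells
dense_bytes <- prod(dim(M)) * 8

# Sparse storage: what the dgCMatrix object actually occupies
sparse_bytes <- as.numeric(object.size(M))

# Summaries computed without converting to a dense matrix
n_stored <- nnzero(M)          # number of nonzero entries
col_tot  <- colSums(M)[1:3]    # column sums of the first three columns
```

The gap between the two sizes grows with the number of cells, which is
why the dense conversion stops being an option for much larger
matrices.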
Yes, I see you work with further code that does not accept
sparse matrices.
The glmnet package (lasso, etc.) accepts sparse matrices very nicely,
{and if you use package 'MatrixModels', with model.Matrix()
you can even use formulas to construct sparse (model) matrices
as input for glmnet!}.
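
A hedged sketch of the formula route follows. Note the swap: it uses
sparse.model.matrix() from the Matrix package rather than
model.Matrix() from MatrixModels, so that it runs with Matrix alone.
The data frame d and its columns y, g and x are made up for
illustration. The glmnet() call is left commented out, since it
assumes that glmnet is installed.

```r
library(Matrix)

# Toy data (hypothetical): a numeric response, a 5-level factor
# and one numeric covariate
set.seed(1)
d <- data.frame(y = rnorm(20),
                g = factor(rep(letters[1:5], times = 4)),
                x = rnorm(20))

# Sparse model matrix from a formula: intercept, 4 dummy columns for g,
# and the numeric column x
X <- sparse.model.matrix(~ g + x, data = d)

# Assumption: glmnet is installed; it takes the sparse matrix as its x
# fit <- glmnet::glmnet(X, d$y)
```

With many factor levels most dummy columns are zero almost everywhere,
which is where the sparse representation pays off.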
After my imminent vacation {three weeks to Colombia and Ecuador!},
I'll gladly help package authors (e.g., of sna) to change their
code so that it works with sparse matrices.
With regards,
Martin Maechler, ETH Zurich
> --
> David.
>>
>> --
>> Patrick Breheny Assistant Professor Department of Biostatistics
>> Department of Statistics University of Kentucky
> David Winsemius, MD
> West Hartford, CT