[R] Matrix package: band matrix
Thomas Lumley
tlumley at u.washington.edu
Sun Feb 22 12:00:52 CET 2009
Exactly what I wanted. Thanks much.
-thomas
On Sat, 21 Feb 2009, Martin Maechler wrote:
>>>>>> "KK" == Ken Knoblauch <ken.knoblauch at inserm.fr>
>>>>>> on Fri, 20 Feb 2009 14:11:55 +0000 (UTC) writes:
>
> KK> Thomas Lumley <tlumley <at> u.washington.edu> writes:
> >> I want to construct a symmetric band matrix in the Matrix
> >> package from a matrix where the first column contains
> >> data for the main diagonal, the second column has data
> >> for the first subdiagonal/superdiagonal and so on.
> >>
> >> Since the Matrix will be 10^5 x 10^5 or so, with perhaps
> >> 10-20 non-zero elements above the diagonal per row, I
> >> can't do it by constructing a full matrix and then using
> >> the band() function to subset it to a band matrix.
> >>
> >> Any suggestions?
> >>
> >> -thomas
> >>
> >> Thomas Lumley Assoc. Professor, Biostatistics tlumley
> >> <at> u.washington.edu University of Washington, Seattle
> >>
> KK> I'm not expert but just looking at the package, perhaps
> KK> the spMatrix function (constructor) would do it.
>
>
> KK> spMatrix(5, 5, c(2:5, 1:5, 1:4), c(1:4, 1:5, 2:5), rnorm(13))
>
> KK> 5 x 5 sparse Matrix of class "dgTMatrix"
>
> KK> [1,] -1.7109379 -0.5576860 . . .
> KK> [2,] -0.3219899 -0.9294558 0.001323253 . .
> KK> [3,] . 0.4099070 0.646068453 0.5388224 .
> KK> [4,] . . -1.007848357 -0.1117796 -0.5555834
> KK> [5,] . . . -0.6816979 -0.6052127
>
> Yes, very good, Ken!
>
> My solution uses the successor of spMatrix(), called
> sparseMatrix(), unless in the symmetric case, where it class
> new("dsTMatrix", ...) directly:
>
> bandSparse <- function(n, m = n, k, listDiag, symmetric = FALSE)
> {
> ## Purpose: Compute a band-matrix by speciyfying its (sub-)diagonal(s)
> ## ----------------------------------------------------------------------
> ## Arguments: (n,m) : Matrix dimension
> ## k : integer vector of "diagonal numbers", with identical
> ## meaning as in band(*, k)
> ## listDiag: list of (sub)diagonals
> ## symmetric: if TRUE, specify only upper or lower triangle;
> ## ----------------------------------------------------------------------
> ## Author: Martin Maechler, Date: 20 Feb 2009, 22:42
> use.x <- !missing(listDiag)
> len.k <- length(k)
> stopifnot(is.list(listDiag),
> k == as.integer(k), n == as.integer(n), m == as.integer(m))
> if(use.x && length(listDiag) != len.k)
> stop(sprintf("'listDiag' must have the same length (%d) as 'k'", len.k))
> k <- as.integer(k)
> n <- as.integer(n)
> m <- as.integer(m)
> stopifnot(n >= 0, m >= 0, -n+1 <= k, k <= m - 1)
> if(symmetric && any(k < 0) && any(k > 0))
> stop("for symmetric band matrix, only specify upper or lower triangle",
> "\n hence, all k must have the same sign")
> dims <- c(n,m)
>
> k.lengths <- ## This is a bit "ugly"; I got the cases "by inspection"
> if(n >= m) {
> ifelse(k >= m-n, m - pmax(0,k), n+k)
> } else { ## n < m
> ifelse(k >= -n+1, n + pmin(0,k), m-k)
> }
> i <- j <- integer(sum(k.lengths))
> if(use.x)
> x <- if(len.k > 0) # carefully getting correct type/mode
> rep.int(listDiag[[1]][1], length(i))
> off.i <- 0L
> for(s in seq_len(len.k)) {
> kk <- k[s] ## *is* integer
> l.kk <- k.lengths[s] ## == length of (sub-)diagonal kk
> ii1 <- seq_len(l.kk)
> ind <- ii1 + off.i
> if(kk >= 0) {
> i[ind] <- ii1
> j[ind] <- ii1 + kk
> } else { ## k < 0
> i[ind] <- ii1 - kk
> j[ind] <- ii1
> }
> if(use.x) {
> xx <- listDiag[[s]]
> if(length(xx) < l.kk)
> warning(sprintf("the %d-th (sub)-diagonal (k = %d) is %s",
> s, kk, "too short; filling with NA's"))
> x[ind] <- xx[ii1]
> }
> off.i <- off.i + l.kk
> }
> if(symmetric) { ## we should have smarter sparseMatrix()
> UpLo <- if(min(k) >= 0) "U" else "L"
> as(if(use.x) {
> if(is.integer(x)) x <- as.double(x)
> new("dsTMatrix", i= i-1L, j = j-1L, x = x, Dim= dims, uplo=UpLo) } else
> new("nsTMatrix", i= i-1L, j = j-1L, Dim= dims, uplo=UpLo),
> "CsparseMatrix")
> }
> else { ## general, not symmetric
> if(use.x)
> sparseMatrix(i=i, j=j, x=x, dims=dims)
> else
> sparseMatrix(i=i, j=j, dims=dims)
> }
> }
>
> if(FALSE) { ## Examples
>
> dList <- list(1:30, 10*(1:20), 100*(1:20))
>
> s1 <- bandSparse(13, k = -c(0:2, 6), lis = c(dList, dList[2]), symm=TRUE)
> s2 <- bandSparse(13, k = c(0:2, 6), lis = c(dList, dList[2]), symm=TRUE)
> stopifnot(identical(s1, t(s2)), is(s1,"dsCMatrix"))
>
> n <- 1e6 ## is already biggish; MM sees top memory up to 1.1 GB;
> ## and it (the bandSparse() call) takes 1 or 2 minutes (slow NB)
> bk <- c(0:5, 7,11)
> bLis <- as.data.frame(matrix(1:8, 1e6, 8, byrow=TRUE))
> B <- bandSparse(1e6, k = bk, list = bLis)
> object.size(B) ## 100 mio Bytes
> Bs <- bandSparse(1e6, k = bk, list = bLis, symmetric=TRUE)
> object.size(Bs)## 100 mio Bytes
> gc()
> show(Bs[1:30, 1:15])
> }
>
>
>
>
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle
More information about the R-help
mailing list