[R] Matrix package: band matrix
Martin Maechler
maechler at stat.math.ethz.ch
Sat Feb 21 00:51:38 CET 2009
>>>>> "KK" == Ken Knoblauch <ken.knoblauch at inserm.fr>
>>>>> on Fri, 20 Feb 2009 14:11:55 +0000 (UTC) writes:
KK> Thomas Lumley <tlumley <at> u.washington.edu> writes:
>> I want to construct a symmetric band matrix in the Matrix
>> package from a matrix where the first column contains
>> data for the main diagonal, the second column has data
>> for the first subdiagonal/superdiagonal and so on.
>>
>> Since the Matrix will be 10^5 x 10^5 or so, with perhaps
>> 10-20 non-zero elements above the diagonal per row, I
>> can't do it by constructing a full matrix and then using
>> the band() function to subset it to a band matrix.
>>
>> Any suggestions?
>>
>> -thomas
>>
>> Thomas Lumley Assoc. Professor, Biostatistics tlumley
>> <at> u.washington.edu University of Washington, Seattle
>>
KK> I'm not expert but just looking at the package, perhaps
KK> the spMatrix function (constructor) would do it.
KK> spMatrix(5, 5, c(2:5, 1:5, 1:4), c(1:4, 1:5, 2:5), rnorm(13))
KK> 5 x 5 sparse Matrix of class "dgTMatrix"
KK> [1,] -1.7109379 -0.5576860 . . .
KK> [2,] -0.3219899 -0.9294558 0.001323253 . .
KK> [3,] . 0.4099070 0.646068453 0.5388224 .
KK> [4,] . . -1.007848357 -0.1117796 -0.5555834
KK> [5,] . . . -0.6816979 -0.6052127
Yes, very good, Ken!
My solution uses the successor of spMatrix(), called
sparseMatrix(), unless in the symmetric case, where it class
new("dsTMatrix", ...) directly:
bandSparse <- function(n, m = n, k, listDiag, symmetric = FALSE)
{
## Purpose: Compute a band-matrix by speciyfying its (sub-)diagonal(s)
## ----------------------------------------------------------------------
## Arguments: (n,m) : Matrix dimension
## k : integer vector of "diagonal numbers", with identical
## meaning as in band(*, k)
## listDiag: list of (sub)diagonals
## symmetric: if TRUE, specify only upper or lower triangle;
## ----------------------------------------------------------------------
## Author: Martin Maechler, Date: 20 Feb 2009, 22:42
use.x <- !missing(listDiag)
len.k <- length(k)
stopifnot(is.list(listDiag),
k == as.integer(k), n == as.integer(n), m == as.integer(m))
if(use.x && length(listDiag) != len.k)
stop(sprintf("'listDiag' must have the same length (%d) as 'k'", len.k))
k <- as.integer(k)
n <- as.integer(n)
m <- as.integer(m)
stopifnot(n >= 0, m >= 0, -n+1 <= k, k <= m - 1)
if(symmetric && any(k < 0) && any(k > 0))
stop("for symmetric band matrix, only specify upper or lower triangle",
"\n hence, all k must have the same sign")
dims <- c(n,m)
k.lengths <- ## This is a bit "ugly"; I got the cases "by inspection"
if(n >= m) {
ifelse(k >= m-n, m - pmax(0,k), n+k)
} else { ## n < m
ifelse(k >= -n+1, n + pmin(0,k), m-k)
}
i <- j <- integer(sum(k.lengths))
if(use.x)
x <- if(len.k > 0) # carefully getting correct type/mode
rep.int(listDiag[[1]][1], length(i))
off.i <- 0L
for(s in seq_len(len.k)) {
kk <- k[s] ## *is* integer
l.kk <- k.lengths[s] ## == length of (sub-)diagonal kk
ii1 <- seq_len(l.kk)
ind <- ii1 + off.i
if(kk >= 0) {
i[ind] <- ii1
j[ind] <- ii1 + kk
} else { ## k < 0
i[ind] <- ii1 - kk
j[ind] <- ii1
}
if(use.x) {
xx <- listDiag[[s]]
if(length(xx) < l.kk)
warning(sprintf("the %d-th (sub)-diagonal (k = %d) is %s",
s, kk, "too short; filling with NA's"))
x[ind] <- xx[ii1]
}
off.i <- off.i + l.kk
}
if(symmetric) { ## we should have smarter sparseMatrix()
UpLo <- if(min(k) >= 0) "U" else "L"
as(if(use.x) {
if(is.integer(x)) x <- as.double(x)
new("dsTMatrix", i= i-1L, j = j-1L, x = x, Dim= dims, uplo=UpLo) } else
new("nsTMatrix", i= i-1L, j = j-1L, Dim= dims, uplo=UpLo),
"CsparseMatrix")
}
else { ## general, not symmetric
if(use.x)
sparseMatrix(i=i, j=j, x=x, dims=dims)
else
sparseMatrix(i=i, j=j, dims=dims)
}
}
if(FALSE) { ## Examples
dList <- list(1:30, 10*(1:20), 100*(1:20))
s1 <- bandSparse(13, k = -c(0:2, 6), lis = c(dList, dList[2]), symm=TRUE)
s2 <- bandSparse(13, k = c(0:2, 6), lis = c(dList, dList[2]), symm=TRUE)
stopifnot(identical(s1, t(s2)), is(s1,"dsCMatrix"))
n <- 1e6 ## is already biggish; MM sees top memory up to 1.1 GB;
## and it (the bandSparse() call) takes 1 or 2 minutes (slow NB)
bk <- c(0:5, 7,11)
bLis <- as.data.frame(matrix(1:8, 1e6, 8, byrow=TRUE))
B <- bandSparse(1e6, k = bk, list = bLis)
object.size(B) ## 100 mio Bytes
Bs <- bandSparse(1e6, k = bk, list = bLis, symmetric=TRUE)
object.size(Bs)## 100 mio Bytes
gc()
show(Bs[1:30, 1:15])
}
More information about the R-help
mailing list