[Rd] Subsetting "dspMatrix" without coercion to "matrix"
Mikael Jagan
j@g@nmn2 @end|ng |rom gm@||@com
Wed Nov 17 06:13:57 CET 2021
Currently, the class for dense symmetric matrices in packed storage,
"dspMatrix", inherits its subset (i.e., `[`) methods from the "Matrix"
class. As a result, subsetting "dspMatrix" requires coercion to
"matrix". If memory cannot be allocated for this "matrix", then an error
results.
n <- 30000L
x <- as.double(seq_len(choose(n + 1L, 2L)))
S <- new("dspMatrix", uplo = "L", x = x, Dim = c(n, n))
S[, 1L]
# Error: vector memory exhausted (limit reached?)
traceback()
# 10: .dsy2mat(dsp2dsy(from))
# 9: asMethod(object)
# 8: as(x, "matrix")
# 7: x[, j = j, drop = TRUE]
# 6: x[, j = j, drop = TRUE]
# 5: eval(call, parent.frame())
# 4: eval(call, parent.frame())
# 3: callGeneric(x, , j = j, drop = TRUE)
# 2: S[, 1L]
# 1: S[, 1L]
This seems entirely avoidable, given that there is a relatively simple
formula for converting 2-ary indices [i,j] of S to 1-ary indices k of
S[lower.tri(S, TRUE)]:
k <- i + round(0.5 * (2L * n - j) * (j - 1L)) # for i >= j
Has the implementation of `[` for class "dspMatrix" been discussed
already elsewhere? If not, shall I report it to Bugzilla as a wishlist item?
FWIW, I encountered this problem while trying to coerce a large "dist"
object to "dspMatrix", expecting that I would be able to safely subset
the result:
setAs(from = "dist", to = "dspMatrix", function(from) {
p <- length(from)
if (p > 0L) {
n <- as.integer(round(0.5 * (1 + sqrt(1 + 8 * p)))) # p = n*(n-1)/2
x <- double(p + n)
i <- 1L + c(0L, cumsum(n:2L))
x[-i] <- from
} else {
n <- 1L
x <- 0
}
new("dspMatrix", uplo = "L", x = x, Dim = c(n, n))
})
But that attempt failed for a entirely different reason. For large
enough n, I would hit a memory limit at the line:
x[-i] <- from
So I guess my second question is: what is problematic about this
benign-seeming subset-assignment?
Mikael
More information about the R-devel
mailing list