[Rd] Why is the diag function so slow (for extraction)?

Thu May 7 17:49:49 CEST 2015

Is it possible to replace c() with .subset()? Example below

####
####

library(microbenchmark)

diag2 <- function(x,nrow, ncol){
  if (is.matrix(x)) {
    if (nargs() > 1L)
      stop("'nrow' or 'ncol' cannot be specified when 'x' is a matrix")
    if ((m <- min(dim(x))) == 0L)
      return(vector(typeof(x), 0L))
    # replace this part
    y <- .subset(x,1 + 0L:(m - 1L) * (dim(x)[1L] + 1))
    nms <- dimnames(x)
    if (is.list(nms) && !any(sapply(nms, is.null)) && identical((nm <-
nms[[1L]][seq_len(m)]),

nms[[2L]][seq_len(m)]))
      names(y) <- nm
    return(y)
  }
  if (is.array(x) && length(dim(x)) != 1L)
    stop("'x' is an array, but not one-dimensional.")
  if (missing(x))
    n <- nrow
  else if (length(x) == 1L && nargs() == 1L) {
    n <- as.integer(x)
    x <- 1
  }
  else n <- length(x)
  if (!missing(nrow))
    n <- nrow
  if (missing(ncol))
    ncol <- n
}

nc  <- 10

set.seed(1)
m <- matrix(sample(letters,nc^2,replace=TRUE), ncol = nc)

runoff <- microbenchmark(
  diaga = diag(m),
  diagb = diag2(m)
)

Regards,

Steve Bronder
Website: stevebronder.com
Phone: 412-719-1282
Email: sbronder at stevebronder.com

On Tue, May 5, 2015 at 9:46 AM, <luke-tierney at uiowa.edu> wrote:

> Looks like the c(x)[...] bit used to be as.matrix(x)[...]. Not sure
> why the change was made many years ago, but this was before names were
> handled explicitly. It would definitely be better to not force the
> duplicate, at least in the case where we are sure c() and [ would not
> dispatch.
>
> Best,
>
> luke
>
> On Mon, 4 May 2015, peter dalgaard wrote:
>
>
>>  On 04 May 2015, at 19:59 , franknarf <by.hook.or at gmail.com> wrote:
>>>
>>> But I'm still wondering why diag() uses c()...? With it being so slow,
>>> I'd
>>> be inclined to write a qdiag() without the c() and just use that the next
>>> time I need matrix algebra. Any insight would be appreciated; thanks!
>>>
>>
>> Well, there are two possibilities: Either it is deliberate or it isn't.
>>
>> The latter isn't too unlikely, given that the effect is seen for large
>> matrices. I would appear to be a matter of O(n) (picking out n items) vs.
>> O(n^2) (copying an n x n matrix), but this might drown out in a context
>> involving matrix multiplication and/or inversion, both of which are O(n^3).
>>
>> If it is deliberate, the question is why. There could be devils in the
>> details; notice in particular that c() strips off non-name attributes.
>> However, I'm not aware of a situation where such attributes could cause
>> trouble.
>>
>> -pd
>>
>>
>>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa                  Phone:             319-335-3386
> Department of Statistics and        Fax:               319-335-3017
>    Actuarial Science
> 241 Schaeffer Hall                  email:   luke-tierney at uiowa.edu
> Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu
>
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

	[[alternative HTML version deleted]]