[Rd] Why is the diag function so slow (for extraction)?

luke-tierney at uiowa.edu luke-tierney at uiowa.edu
Sat May 16 17:05:43 CEST 2015


Changing from c(x)[...] to x[...] without a pre-test does not seem to
cause any issues on CRAN or BIOC so that is done now in R-devel and
R-patched.

Best,

luke

On Wed, 13 May 2015, Henrik Bengtsson wrote:

> As kindly pointed out to me (oh my decaying gray matter), is.object()
> is better suited for this test;
>
> $ svn diff src/library/base/R/diag.R
> Index: src/library/base/R/diag.R
> ===================================================================
> --- src/library/base/R/diag.R   (revision 68345)
> +++ src/library/base/R/diag.R   (working copy)
> @@ -23,9 +23,11 @@
>             stop("'nrow' or 'ncol' cannot be specified when 'x' is a matrix")
>
>         if((m <- min(dim(x))) == 0L) return(vector(typeof(x), 0L))
> +        nms <- dimnames(x)
> +        nrow <- dim(x)[1L]
>         ## NB: need double index to avoid overflows.
> -        y <- c(x)[1 + 0L:(m - 1L) * (dim(x)[1L] + 1)]
> -        nms <- dimnames(x)
> +        if (is.object(x)) x <- c(x)
>
> /Henrik
>
> On Tue, May 12, 2015 at 8:24 PM, Henrik Bengtsson
> <henrik.bengtsson at ucsf.edu> wrote:
>> Along Luke's lines, would(n't) it be enough to look for existence of
>> attribute 'class' to decide whether to dispatch or not, i.e. if c() is
>> needed or not?  Even without .subset(), there is a remarkable
>> improvement.  I think it's worth condition the code on dispatch or
>> not.  For example:
>>
>> [HB-X201]{hb}: svn diff diag.R
>> Index: diag.R
>> ===================================================================
>> --- diag.R      (revision 68345)
>> +++ diag.R      (working copy)
>> @@ -23,9 +23,11 @@
>>              stop("'nrow' or 'ncol' cannot be specified when 'x' is a matrix")
>>
>>          if((m <- min(dim(x))) == 0L) return(vector(typeof(x), 0L))
>> +        nms <- dimnames(x)
>> +        nrow <- dim(x)[1L]
>>          ## NB: need double index to avoid overflows.
>> -        y <- c(x)[1 + 0L:(m - 1L) * (dim(x)[1L] + 1)]
>> -        nms <- dimnames(x)
>> +        if (!is.null(attr(x, "class"))) x <- c(x)
>> +        y <- x[1 + 0L:(m - 1L) * (nrow + 1)]
>>          if (is.list(nms) && !any(sapply(nms, is.null)) &&
>>              identical((nm <- nms[[1L]][seq_len(m)]), nms[[2L]][seq_len(m)]))
>>              names(y) <- nm
>>
>> ?
>>
>> /Henrik
>>
>> On Tue, May 12, 2015 at 5:33 AM, Martin Maechler
>> <maechler at lynne.stat.math.ethz.ch> wrote:
>>>>>>>> Steve Bronder <sbronder at stevebronder.com>
>>>>>>>>     on Thu, 7 May 2015 11:49:49 -0400 writes:
>>>
>>>    > Is it possible to replace c() with .subset()?
>>>
>>> It would be possible, but I think "entirely" wrong.
>>>
>>> .subset() is documented to be an internal function not to be
>>> used "lightly" and more to the point it is documented to *NOT*
>>> dispatch at all.
>>>
>>> If you read and understood what Peter and Luke wrote, you'd not
>>> special case here:
>>>
>>> diag() should not work only for pure matrices, but for all
>>> "matrix-like" objects for which ``the usual methods'' work, such
>>> as
>>>    as.vector(.), c(.)
>>>
>>> That's why there has been the c(.) in there.
>>>
>>> You can always make code faster if you write the code so it only
>>> has to work in one special case .. and work there very fast.
>>>
>>>
>>>    > Example below
>>>    > ####
>>>    > ####
>>>
>>>    > library(microbenchmark)
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney at uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu



More information about the R-devel mailing list