[R] Recursion is slow

Martin Morgan mtmorgan at fhcrc.org
Wed Sep 9 23:40:24 CEST 2009


Hi Bryan --

Bryan Keller wrote:
> Bill,
> 
> Thanks for the great tips for speeding up functions in your response.  Those are really useful for me.  Even with the improvements the recursion is still many times slower than I need it to be in order to make it useful in R.  I may just have to suck it up and call a compiled language.

Probably you've moved on, but it struck me that identical values of A()
are being calculated repeatedly, so one can perhaps 'memoize' them
(record that they've already been calculated). I'm not sure that I have
this exactly right, or that I've memoized in the right place, but this

Cmemo <- function(T, m) {
    C <- function(lt, m) {
        if (lt == 1L) {
            R <- as.numeric(0 <= m & m <= T[1])
        } else {
            R <- 0
            for (u in 0:T[lt]) {
                if (is.na(memo[lt-1L, offset + m-u]))
                    memo[lt-1L, offset + m-u] <<- C(lt-1L, m-u)
                R <- R + memo[lt-1L, offset + m-u]
            }
        }
        R
    }
    offset <- sum(T[-1]) - m + 1L
    nrow <- length(T) - 1L
    memo <- matrix(rep(NA_real_, nrow * (offset + m)), nrow=nrow)
    C(length(T), m)
}

seems to give consistent answers quickly (A1 is a tidied version of your
original recursion, B is Bill Dunlap's)

> system.time(ansA1 <- A1( c(20,40,35,21,31), 100))
   user  system elapsed
 12.144   0.008  12.202
> system.time(ansB <- B( c(20,40,35,21,31), 100))
   user  system elapsed
  0.272   0.000   0.270
> system.time(ansC <- Cmemo( c(20,40,35,21,31), 100))
   user  system elapsed
  0.064   0.000   0.066
> all.equal(ansA1, ansB)
[1] TRUE
> all.equal(ansA1, ansC)
[1] TRUE

It's a little memory inefficient, but seems to scale fairly well.

Martin

> 
> Bryan
> 
> -------------
> Bryan Keller, Doctoral Student/Project Assistant
> Educational Psychology - Quantitative Methods
> The University of Wisconsin - Madison
> 
> ----- Original Message -----
> From: William Dunlap <wdunlap at tibco.com>
> Date: Thursday, September 3, 2009 6:41 pm
> Subject: RE: [R] Recursion is slow
> To: Bryan Keller <bskeller at wisc.edu>, r-help at r-project.org
> 
>> Bill Dunlap
>> TIBCO Software Inc - Spotfire Division
>> wdunlap tibco.com  
>>
>>> -----Original Message-----
>>> From: r-help-bounces at r-project.org 
>>> [mailto:r-help-bounces at r-project.org] On Behalf Of Bryan Keller
>>> Sent: Thursday, September 03, 2009 2:11 PM
>>> To: r-help at r-project.org
>>> Subject: [R] Recursion is slow
>>>
>>> The following recursion is about 120 times faster in C#.  I 
>>> know R is not known for its speed with recursions but I'm 
>>> wondering if anyone has a tip about how to speed things up in R.
>>>
>>> #"T" is a vector and "m" is a number between 1 and sum(T)
>>>
>>> A <- function(T,m) {
>>> lt <- length(T)
>>>
>>> if (lt == 1) {
>>> 	if (0 <= m & m <= T[1]) {
>>>         	return(1)
>>>         	} else {
>>>         	return(0)
>>> 	}
>>> }
>>> R <- 0
>>> for (u in 0:T[lt]) {
>>> 	R <- (R+(A(T[1:(lt-1)],(m-u))))
>>> 	}
>>> return(R)
>>> }
>>>
>> For starters, remove all repeated calculations from
>> the for loop.  Then vectorize the length(T)==2 case
>> to avoid a lot of calls to the scalar case.  Finally,
>> I noticed that the answer was independent of the
>> order of T and that putting it in reverse sorted order
>> was the faster order, so I did that.  I use default
>> values of arguments so you don't have to supply
>> derived values when you start but the recursions
>> don't have to recompute some things.
>>
>> B <- function(T,m,lt=length(T), Tsorted = rev(sort(T))) {
>>    if (lt == 1L) {
>>       R <- as.integer(m <= Tsorted && 0L <= m)
>>    } else if (lt == 2L) {
>>       mu <- m - (0:Tsorted[2L])
>>       R <- sum(mu <= Tsorted[1L] & 0L <= mu)
>>    } else {
>>       R <- 0L
>>       lt1 <- lt-1L
>>       T1 <- Tsorted[1:lt1]
>>       for (mu in m-(0:Tsorted[lt])) {
>>          R <- R + B(unused, mu, lt1, T1)
>>       }
>>    }
>>    R
>> }
>>
>> E.g.,
>>
>>> system.time(A( c(20,40,35,21,31), 100))
>>    user  system elapsed 
>>   13.23    0.08   12.93 
>>> system.time(B( c(20,40,35,21,31), 100))
>>    user  system elapsed 
>>    0.35    0.00    0.33 
>>> A( c(20,40,35,21,31), 100)
>> [1] 193363
>>> B( c(20,40,35,21,31), 100)
>> [1] 193363
>>
>> Bill Dunlap
>> TIBCO Software Inc - Spotfire Division
>> wdunlap tibco.com 
>>
>>> -------------
>>> Bryan Keller, Doctoral Student/Project Assistant
>>> Educational Psychology - Quantitative Methods
>>> The University of Wisconsin - Madison
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide 
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.




More information about the R-help mailing list