[R] Recursion is slow
William Dunlap
wdunlap at tibco.com
Thu Sep 10 00:57:46 CEST 2009
Note that the memoization optimization is orthogonal
to the simpler optimizations I suggested (don't repeat
calculations in loops, sort the input vector, do a little
math to avoid so many recursions to the leaves). You
can combine them to get
Cmemo1 <- function(T, m) {
C <- function(lt, m) {
if (lt == 1L) {
R <- as.numeric(0 <= m & m <= T[1])
} else if (lt == 2L) {
mu <- m - (0:T[2L])
R <- sum(mu <= T[1L] & 0L <= mu)
} else {
R <- 0
lt1 <- lt-1L
for (mu in m-(0:T[lt])) {
if (is.na(memo[lt1, offset + mu]))
memo[lt-1L, offset + mu] <<- C(lt1, mu)
R <- R + memo[lt1, offset + mu]
}
}
R
}
T <- rev(sort(T))
m <- as.integer(m)
offset <- sum(T[-1]) - m + 1L
nrow <- length(T) - 1L
memo <- matrix(rep(NA_real_, nrow * (offset + m)), nrow=nrow)
C(length(T), m)
}
> system.time(z1<-Cmemo1(c(10,30,20,45,3,100,21,37,50,20,37,46,90),409))
user system elapsed
0.44 0.00 0.37
> system.time(z<-Cmemo(c(10,30,20,45,3,100,21,37,50,20,37,46,90),409))
user system elapsed
1.24 0.01 1.19
> identical(z,z1)
[1] TRUE
Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
> -----Original Message-----
> From: Bryan Keller [mailto:bskeller at wisc.edu]
> Sent: Wednesday, September 09, 2009 3:25 PM
> To: Martin Morgan
> Cc: William Dunlap; r-help at r-project.org
> Subject: Re: [R] Recursion is slow
>
> Thanks Martin! It seems to be right on and it is blazing
> fast! I greatly appreciate the responses from you and Bill
> both. As a beginning user of R it is really helpful to be
> able to compare my code with yours and try to figure out your tricks.
>
> Bryan
>
> -------------
> Bryan Keller, Doctoral Student/Project Assistant
> Educational Psychology - Quantitative Methods
> The University of Wisconsin - Madison
>
> ----- Original Message -----
> From: Martin Morgan <mtmorgan at fhcrc.org>
> Date: Wednesday, September 9, 2009 4:40 pm
> Subject: Re: [R] Recursion is slow
> To: Bryan Keller <bskeller at wisc.edu>
> Cc: William Dunlap <wdunlap at tibco.com>, r-help at r-project.org
>
>
> > Hi Bryan --
> >
> > Bryan Keller wrote:
> > > Bill,
> > >
> > > Thanks for the great tips for speeding up functions in your
> > response. Those are really useful for me. Even with the
> improvements
> > the recursion is still many times slower than I need it to
> be in order
> > to make it useful in R. I may just have to suck it up and call a
> > compiled language.
> >
> > Probably you've moved on, but it struck me that identical
> values of A()
> > are being calculated repeatedly, so one can perhaps 'memoize' them
> > (record that they've already been calculated). I'm not sure
> that I have
> > this exactly right, or that I've memoized in the right
> place, but this
> >
> > Cmemo <- function(T, m) {
> > C <- function(lt, m) {
> > if (lt == 1L) {
> > R <- as.numeric(0 <= m & m <= T[1])
> > } else {
> > R <- 0
> > for (u in 0:T[lt]) {
> > if (is.na(memo[lt-1L, offset + m-u]))
> > memo[lt-1L, offset + m-u] <<- C(lt-1L, m-u)
> > R <- R + memo[lt-1L, offset + m-u]
> > }
> > }
> > R
> > }
> > offset <- sum(T[-1]) - m + 1L
> > nrow <- length(T) - 1L
> > memo <- matrix(rep(NA_real_, nrow * (offset + m)), nrow=nrow)
> > C(length(T), m)
> > }
> >
> > seems to give consistent answers quickly (A1 is a tidied
> version of your
> > original recursion, B is Bill Dunlap's)
> >
> > > system.time(ansA1 <- A1( c(20,40,35,21,31), 100))
> > user system elapsed
> > 12.144 0.008 12.202
> > > system.time(ansB <- B( c(20,40,35,21,31), 100))
> > user system elapsed
> > 0.272 0.000 0.270
> > > system.time(ansC <- Cmemo( c(20,40,35,21,31), 100))
> > user system elapsed
> > 0.064 0.000 0.066
> > > all.equal(ansA1, ansB)
> > [1] TRUE
> > > all.equal(ansA1, ansC)
> > [1] TRUE
> >
> > It's a little memory inefficient, but seems to scale fairly well.
> >
> > Martin
> >
> > >
> > > Bryan
> > >
> > > -------------
> > > Bryan Keller, Doctoral Student/Project Assistant
> > > Educational Psychology - Quantitative Methods
> > > The University of Wisconsin - Madison
> > >
> > > ----- Original Message -----
> > > From: William Dunlap <wdunlap at tibco.com>
> > > Date: Thursday, September 3, 2009 6:41 pm
> > > Subject: RE: [R] Recursion is slow
> > > To: Bryan Keller <bskeller at wisc.edu>, r-help at r-project.org
> > >
> > >> Bill Dunlap
> > >> TIBCO Software Inc - Spotfire Division
> > >> wdunlap tibco.com
> > >>
> > >>> -----Original Message-----
> > >>> From: r-help-bounces at r-project.org
> > >>> [mailto:r-help-bounces at r-project.org] On Behalf Of Bryan Keller
> > >>> Sent: Thursday, September 03, 2009 2:11 PM
> > >>> To: r-help at r-project.org
> > >>> Subject: [R] Recursion is slow
> > >>>
> > >>> The following recursion is about 120 times faster in C#. I
> > >>> know R is not known for its speed with recursions but I'm
> > >>> wondering if anyone has a tip about how to speed things up in R.
> > >>>
> > >>> #"T" is a vector and "m" is a number between 1 and sum(T)
> > >>>
> > >>> A <- function(T,m) {
> > >>> lt <- length(T)
> > >>>
> > >>> if (lt == 1) {
> > >>> if (0 <= m & m <= T[1]) {
> > >>> return(1)
> > >>> } else {
> > >>> return(0)
> > >>> }
> > >>> }
> > >>> R <- 0
> > >>> for (u in 0:T[lt]) {
> > >>> R <- (R+(A(T[1:(lt-1)],(m-u))))
> > >>> }
> > >>> return(R)
> > >>> }
> > >>>
> > >> For starters, remove all repeated calculations from
> > >> the for loop. Then vectorize the length(T)==2 case
> > >> to avoid a lot of calls to the scalar case. Finally,
> > >> I noticed that the answer was independent of the
> > >> order of T and that putting it in reverse sorted order
> > >> was the faster order, so I did that. I use default
> > >> values of arguments so you don't have to supply
> > >> derived values when you start but the recursions
> > >> don't have to recompute some things.
> > >>
> > >> B <- function(T,m,lt=length(T), Tsorted = rev(sort(T))) {
> > >> if (lt == 1L) {
> > >> R <- as.integer(m <= Tsorted && 0L <= m)
> > >> } else if (lt == 2L) {
> > >> mu <- m - (0:Tsorted[2L])
> > >> R <- sum(mu <= Tsorted[1L] & 0L <= mu)
> > >> } else {
> > >> R <- 0L
> > >> lt1 <- lt-1L
> > >> T1 <- Tsorted[1:lt1]
> > >> for (mu in m-(0:Tsorted[lt])) {
> > >> R <- R + B(unused, mu, lt1, T1)
> > >> }
> > >> }
> > >> R
> > >> }
> > >>
> > >> E.g.,
> > >>
> > >>> system.time(A( c(20,40,35,21,31), 100))
> > >> user system elapsed
> > >> 13.23 0.08 12.93
> > >>> system.time(B( c(20,40,35,21,31), 100))
> > >> user system elapsed
> > >> 0.35 0.00 0.33
> > >>> A( c(20,40,35,21,31), 100)
> > >> [1] 193363
> > >>> B( c(20,40,35,21,31), 100)
> > >> [1] 193363
> > >>
> > >> Bill Dunlap
> > >> TIBCO Software Inc - Spotfire Division
> > >> wdunlap tibco.com
> > >>
> > >>> -------------
> > >>> Bryan Keller, Doctoral Student/Project Assistant
> > >>> Educational Psychology - Quantitative Methods
> > >>> The University of Wisconsin - Madison
> > >>>
> > >>> ______________________________________________
> > >>> R-help at r-project.org mailing list
> > >>> https://stat.ethz.ch/mailman/listinfo/r-help
> > >>> PLEASE do read the posting guide
> > >>> http://www.R-project.org/posting-guide.html
> > >>> and provide commented, minimal, self-contained,
> reproducible code.
> > >>>
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
>
More information about the R-help
mailing list