[R] Alternatives for explicit for() loops

marammagdysalem at gmail.com marammagdysalem at gmail.com
Sun Nov 15 18:18:40 CET 2015


Thanks a lot Jim and Boris for replying.

Sent from my iPhone

> On Nov 9, 2015, at 1:13 AM, jim holtman <jholtman at gmail.com> wrote:
> 
> You need to take a close look at the function incomb that you are creating.  I see what appears to be a constant value ("*(gamma((1/beta)+1))*((alpha)^(-(1/beta)))") being computed that you might only have to compute once before the function.  You are also referencing many variables (m, LED, j, ...) that are not being passed in that are in the global environment; it is probably better to pass them in to the function.  I am not sure what 'pbapply' is doing for you since I see this is new to the code that you first sent out.
> 
> I would be good it you told us what the function is trying to do; you are showing us how you want to do it, not what you want to do.  Are there other ways of doing it?  If speed is your problem, then consider the "Rcpp" package and write the function is C++ which might be faster, but again, take a look at what you are doing to see if there are other ways.  I don't have time to dig into the code, since there is a lack of comments, to understand why you are using, e.g., 'choose', 'prod', etc.).  There are probably a lot of ways of speeding up the code, if could tell us what you want to accomplish.
> 
> 
> Jim Holtman
> Data Munger Guru
>  
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
> 
>> On Sun, Nov 8, 2015 at 4:48 AM, Maram SAlem <marammagdysalem at gmail.com> wrote:
>> Thanks all for replying.
>> 
>> In fact I've used the the Rprof() function and found out that the incomb() function (in my code above)  takes about 80% of the time, but I didn't figure out which part of the function is causing the delay. So I thought that this may be due to the for() loops. 
>> I MUST run this code for rather large values of n and m, so is there any way that can help me do that without having to wait for more than three days to reach an output. N.B. I'll have to repeat these runs for may be 70 or 80 times , and this means HUGE time
>> 
>> I'd appreciate any sort of help.
>> Thanks in advance.
>> 
>> Maram Salem
>> 
>>> On 6 November 2015 at 16:54, jim holtman <jholtman at gmail.com> wrote:
>>> If you have code that is running for a long time, then take a small case that only runs for 5-10 minutes and turn on the RProfiler so that you can see where you are spending your time.  In most cases, it is probably not the 'for' loops that are causing the problem, but some function/calculation you are doing within the loop that is consuming the time, and until you determine what section of code that is, is it hard to tell exactly what the problem is, much less the solution.
>>> 
>>> 
>>> Jim Holtman
>>> Data Munger Guru
>>>  
>>> What is the problem that you are trying to solve?
>>> Tell me what you want to do, not how you want to do it.
>>> 
>>>> On Wed, Nov 4, 2015 at 9:09 AM, Maram SAlem <marammagdysalem at gmail.com> wrote:
>>>> Hi Jim,
>>>> 
>>>> Thanks a lot for replying.
>>>> 
>>>> In fact I'm trying to run a simulation study that enables me to calculate the Bayes risk of a sampling plan selected from progressively type-II censored Weibull model. One of the steps involves evaluating the expected test time, which is a rather complicated formula that involves nested multiple summations where the counters of the summation signs are dependent, that's why I thought of I should create the incomb() function inside the loop, or may be I didn't figure out how to relate its arguments to the ones inside the loop had I created it outside it.  I'm trying to create a matrix of all the possible combinations involved in the summations and then use the apply() function on each row of that matrix. The problem is that the code I wrote works perfectly well for rather small values of the sample size,n, and the censoring number, m (for example, n=8,m=4),but when n and m are increased (say, n=25,m=15) the code keeps on running for days with no output. That's why I thought I should try to avoid explicit loops as much as possible, so I did my best in this regard but still the code takes too long to execute,(more than three days), thus, i believe there must be something wrong.
>>>> 
>>>> Here's the full code:
>>>> 
>>>> library(pbapply)
>>>> f1 <- function(n, m) {
>>>>    stopifnot(n > m)
>>>>    r0 <- t(diff(combn(n-1, m-1)) - 1L)
>>>>    r1 <- rep(seq(from=0, len=n-m+1), choose( seq(to=m-2, by=-1, len=n-m+1), m-2))
>>>>    cbind(r0[, ncol(r0):1, drop=FALSE], r1, deparse.level=0)
>>>> }
>>>> simpfun<- function (x,n,m,p,alpha,beta)
>>>>   {
>>>>   a<-factorial(n-m)/(prod((factorial(x)))*(factorial((n-m)- sum(x))))
>>>>   b <-  ((m-1):1)
>>>>   c<- a*((p)^(sum(x)))*((1-p)^(((m-1)*(n-m))- sum(x%*%(as.matrix(b)))))
>>>> d <- n - cumsum(x) - (1:(m-1))
>>>>   e<- n*(prod(d))*c
>>>> LD<-list()
>>>>    for (i in 1:(m-1))  {
>>>>    LD[[i]]<-seq(0,x[i],1)
>>>>    }
>>>>    LD[[m]]<-seq(0,(n-m-sum(x)),1)
>>>>    LED<-expand.grid (LD)
>>>>    LED<-as.matrix(LED)
>>>>    store1<-numeric(nrow(LED))
>>>> for (j in 1:length(store1) )
>>>>          {
>>>>             incomb<-function(x,alpha,beta) {
>>>>              g<-((-1)^(sum(LED[j,])))*(gamma((1/beta)+1))*((alpha)^(-(1/beta)))
>>>>                     h <- choose(x, LED[j,-m])    
>>>>                    ik<-prod(h)*choose((n-m-sum(x)),LED[j,m]) 
>>>>                 lm<-cumsum(LED[j,-m]) + (1:(m-1))  
>>>>                 plm<-prod(lm)
>>>>                gil<-g*ik/(plm)
>>>>              hlm<-numeric(sum(LED[j,])+(m-1))
>>>>              dsa<-length(hlm)
>>>>               for (i in 1:dsa)  
>>>>                 {
>>>>                  ppp<- sum(LED[j,])+(m-1)
>>>>                   hlm[i]<-  (choose(ppp,i))*((-1)^(i))*((i+1)^((-1)*((1/beta)+1)))  
>>>>                  }
>>>>           shl<-gil*(sum(hlm)+1)         
>>>>           return (shl)
>>>>           }
>>>>        store1[j]<-incomb(x,alpha=0.2,beta=2)
>>>>       }
>>>> val1<- sum(store1)*e
>>>> return(val1)
>>>> }
>>>> 
>>>> va<-pbapply(s,1,simpfun,n=6,m=4,p=0.3,alpha=0.2,beta=2)
>>>> EXP<-sum(va)
>>>> 
>>>> 
>>>> 
>>>> Any help would be greatly appreciated.
>>>> Thanks a lot  for your time.
>>>> 
>>>> Best Regards, 
>>>> Maram Salem
>>>> 
>>>> 
>>>>> On 2 November 2015 at 00:27, jim holtman <jholtman at gmail.com> wrote:
>>>>> Why are you recreating the incomb function within the loop instead of defining it outside the loop?  Also you are referencing several variables that are global (e.g., m & j); you should be passing these in as parameters to the function.
>>>>> 
>>>>> 
>>>>> Jim Holtman
>>>>> Data Munger Guru
>>>>>  
>>>>> What is the problem that you are trying to solve?
>>>>> Tell me what you want to do, not how you want to do it.
>>>>> 
>>>>>> On Sun, Nov 1, 2015 at 7:31 AM, Maram SAlem <marammagdysalem at gmail.com> wrote:
>>>>>> Hi All,
>>>>>> 
>>>>>> I'm writing a long code that takes long time to execute. So I used the
>>>>>> Rprof() function and found out that the function that takes about 80% of
>>>>>> the time is the incomb () fucntion (below), and this is most probably
>>>>>> because of the many explicit for() loops I'm using.
>>>>>> 
>>>>>> n=18;m=4;p=0.3;alpha=0.2;beta=2
>>>>>> x=c(3,0,0)
>>>>>> LD<-list()
>>>>>>    for (i in 1:(m-1))  {
>>>>>>    LD[[i]]<-seq(0,x[i],1)
>>>>>>    }
>>>>>>    LD[[m]]<-seq(0,(n-m-sum(x)),1)
>>>>>>    LED<-expand.grid (LD)
>>>>>>    LED<-as.matrix(LED)
>>>>>>    store1<-numeric(nrow(LED))
>>>>>>     h<- numeric(m-1)
>>>>>>     lm<- numeric(m-1)
>>>>>>      for (j in 1:length(store1) )
>>>>>>          {
>>>>>>             incomb<-function(x,alpha,beta) {
>>>>>> 
>>>>>>  g<-((-1)^(sum(LED[j,])))*(gamma((1/beta)+1))*((alpha)^(-(1/beta)))
>>>>>>                   for (i in 1:(m-1))  {
>>>>>>                        h[i]<- choose(x[i],LED[j,i])
>>>>>>                        }
>>>>>>                  ik<-prod(h)*choose((n-m-sum(x)),LED[j,m])
>>>>>>                 for (i in 1:(m-1)) {
>>>>>>                        lm[i]<-(sum(LED[j,1:i])) + i
>>>>>>                      }
>>>>>>                 plm<-prod(lm)
>>>>>>                gil<-g*ik/(plm)
>>>>>>              hlm<-numeric(sum(LED[j,])+(m-1))
>>>>>>              dsa<-length(hlm)
>>>>>>               for (i in 1:dsa)
>>>>>>                 {
>>>>>>                  ppp<- sum(LED[j,])+(m-1)
>>>>>>                   hlm[i]<-
>>>>>>  (choose(ppp,i))*((-1)^(i))*((i+1)^((-1)*((1/beta)+1)))
>>>>>>                  }
>>>>>>           shl<-gil*(sum(hlm)+1)
>>>>>>           return (shl)
>>>>>>           }
>>>>>>        store1[j]<-incomb(x,alpha=0.2,beta=2)
>>>>>>       }
>>>>>> 
>>>>>> 
>>>>>> I'm trying to use alternatives (for ex. to vectorize things) to the
>>>>>> explicit for() loops, but things don't work out.
>>>>>> 
>>>>>> Any suggestions that can help me to speed up the execution of the incomb()
>>>>>> function are much appreciated.
>>>>>> 
>>>>>> Thanks a lot in advance.
>>>>>> 
>>>>>> Maram Salem
>>>>>> 
>>>>>>         [[alternative HTML version deleted]]
>>>>>> 
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
> 

	[[alternative HTML version deleted]]



More information about the R-help mailing list