[R] Antwort: Re: an alternative to R for nonlinear stat models

Wed Jun 16 20:33:34 CEST 2010

   Hi

   I  think  my first answer doesn't seem to have gone through due to the
   attachement. So I copy pasted my last answer including the code below.

   Sorry about that:

   Hi Paul

   Oh ok sorry, I send you below a copy of the R code I was using. It's very
   possible that my programming is slow as well. I'm very happy to learn.

   Sorry the code is a bit long, but its structured like this and cotains three
   main functions and the optimization:

   1. Input - (I didn't send the data)

   2.  Function  to  calculate the cell probabilities for the multinomial
   likelihood

   3. Function to calculate the multinomial

   4.Function to be optimized

   5. then I set the starting values

   6. optimization using mle2

   hope that helps

   cheers

   Beni

   ###########################################################################
   ###########################################################################

   #library(boot)
   #library(bbmle)
   #library(demogR)

   ##Simulated harvest numbers of a cohort harvested over 30 years (including
   random noise)
   pop<-N.true       # True population size
   cohort<-C       # harvest matrix
   R<-nrow(cohort)
   L<-ncol(cohort)

   h<-matrix(rep(Q,R),ncol=a,byrow=T)  # estimated harvest rates
   S<-matrix(rep(P[-1],R),ncol=a-1,byrow=T)  # survival rates

   ##############################################
   ##Function to calculate the cell probabilities
   ##############################################

   predprob <- function(S=S,h=h) {
     R<-nrow(cohort)
     L<-ncol(cohort)
     p <- matrix(rep(0,(R)*(L)),ncol=L)  # matrix of cell probabilities
      p[R,1]<-h[R,1]
    p[1,L]<-h[1,L]
       ay<-rep(0,L+R+1)     #   vector  of  last  cell  probabilities  ->
   1-sum(rest));Gove(2002)
      ay[L+R]<-1-h[1,L]
    ay[2]<-1-h[R,1]
        index<-3       #   ay[1]=Null,   ay[2]=1-h[R,1],ay[L+R]=1-h[1,L],
   ay[L+R+1]=Null

   for (i in -(R-2):(L-2)){    # Calculating the cell prob after Gove (2002)

    p[row(p) == col(p) - i] <-
   odiag(h,i)*c(1,cumprod((1-odiag(h[1:(R-1),1:(L-1)],i))
         *odiag(S[1:(R-1),],i)))

    ay[index]<-1-sum(odiag(p[1:R,],i))

    index<-index+1
      }

    p<-rbind(p,ay[1:L])
    p<-cbind(p,ay[length(ay):(L+1)])
   return(p)

    }

   predprob(S,h)      #test the function

   ####################################
   ##Function to calculate the multinomial ->from Ben Bolker
   ####################################

   ## multinom WITHOUT ROUNDING & without error-checking
   dmnom <- function(x,prob,log=FALSE) {
     r <- lgamma(sum(x) + 1) + sum(x * log(prob) - lgamma(x + 1))
     if (log) r else exp(r)
   }

   ###########################
   ## function to be optimized
   ###########################

   mfun <- function(logN,S=S,h=h,cohort=cohort) {
     L<-ncol(cohort)
     R<-nrow(cohort)
     d<-matrix(rep(NA,(R+1)*(L+1)),ncol=L+1)  # last row and last col of the
   matrix
      d[1:R,1:L]<-log(cohort)    #  are  the  values to be optimized->Log
   tranformed data!

     index<-1
     index1<-(R+L)

   for (i in -(R-1):(L-1)){    # values for the top row
   if (i<= -(R-L) ){
   d[(R+1),(index+1)]<-log(exp(logN[index])-sum(odiag(cohort,i)))} else
   if (i> -(R-L)& i<1 ){
   d[index1,(L+1)]<-log(exp(logN[index])-sum(odiag(cohort,i)))} else
   if (i>0){
   d[index1,(L+1)]<-log(exp(logN[index])-sum(odiag(cohort,i)))}

   index<-index+1
   index1<-index1-1
   }

   pp<-predprob(S,h)

   lhood<-0       # The Likelihood function -> to be maximized

   for (i in -(R-1):(L-1)){
    lhood<-lhood+(-dmnom(exp(odiag(d,i)),prob=odiag(pp,i),log=TRUE))
   }
   return(lhood)
   }

   logN<-log(rep(1000,(R+L-1)))     # Test the function with test-values
   mfun(logN,S,h,cohort)

   ############################################################################
   ##############
   ############################################################################
   ##############

   ##########
   ##Starting values for the optimization
   ##########

   N<-rep(500,(R+L-1))     # optimization,->N-x1-x2-x3
   svec = log(N)       # transform to avoid constraints (x>0)
   names(svec) <- paste("logN",1:(R+L-1),sep="")
   parnames(mfun) <- names(svec)

   ##########
   ##Lower bounds for the L-BFGS-B-method
   ##########

   index<-1
   lower<-rep(0,(R+L-1))

   for (i in -(R-1):(L-1)){
   lower[index]<-log(sum(odiag(cohort,i)))
   index<-index+1
   }

   lower<-lower+0.001
   names(lower)<-names(svec)
   exp(lower)       # check lower bounds
   #########################
   ##Optimization with mle2
   #########################
   ## use bounded optimization

   m1 = mle2(mfun,start=svec,
     method="L-BFGS-B",lower=lower,upper=10,  # lower=must be larger than
   "sum(odiag(x))"<-if smaller NANs are produced
     vecpar=TRUE,
     data=list(S=S,h=h,cohort=cohort))

   coef(m1)
   summary(m1)
   Nest<-exp(coef(m1))     # estimated cohort sizes for row and colum 1

   -----Paul Hiemstra <p.hiemstra at geo.uu.nl> schrieb: -----

     An: benedikt.gehr at ieu.uzh.ch
     Von: Paul Hiemstra <p.hiemstra at geo.uu.nl>
     Datum: 16.06.2010 09:36
     Kopie: r-help at r-project.org, davef at otter-rsch.com
     Betreff: Re: [R] an alternative to R for nonlinear stat models
     On 06/16/2010 07:35 AM, benedikt.gehr at ieu.uzh.ch wrote:
     >     Hi
     >     I implemented the age-structure model in Gove et al (2002) in R,
     which is a
     >     nonlinear statistical model. However running the model in R was very
     slow.
     >     So Dave Fournier suggested to use the AD Model Builder Software
     package and
     >     helped me implement the model there.
     >     ADMB was incredibly fast in running the model:
     >     While running the model in R took 5-10 minutes, depending on the
     settings,
     >     in ADMB it took 1-2 seconds!
     >     I'm reporting this so that people who have performance issues with
     nonlinear
     >       statistical models in R will know that there is a good free
     alternative for
     >     more difficult problems.
     >     There  is also a help platfrom equivalent to the one for R, and
     people
     >     running it are extremley helpful.
     >     I hope this might help someone
     >     cheers
     >     Beni
     > ______________________________________________
     > R-help at r-project.org mailing list
     > [1]https://stat.ethz.ch/mailman/listinfo/r-help
     > PLEASE do read the posting guide
     [2]http://www.R-project.org/posting-guide.html
     > and provide commented, minimal, self-contained, reproducible code.
     >
     Hi Beni,
     Thanks for posting information that might be useful for people on the
     list. The only thing is that without a little more detail on how exactly
     you implemented things in R, we are left to guess if the performance
     issues are a problem of R, or that your particular implementation was
     the problem. There are was of implementing R code in two ways, where the
     first takes minutes and the second 1-2 seconds. Furthermore, you are
     giving us no option to defend R ;).
     cheers,
     Paul
     --
     Drs. Paul Hiemstra
     Department of Physical Geography
     Faculty of Geosciences
     University of Utrecht
     Heidelberglaan 2
     P.O. Box 80.115
     3508 TC Utrecht
     Phone:  +3130 274 3113 Mon-Tue
     Phone:  +3130 253 5773 Wed-Fri
     [3]http://intamap.geo.uu.nl/~paul
     [4]http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770

References

   1. https://stat.ethz.ch/mailman/listinfo/r-help
   2. http://www.r-project.org/posting-guide.html
   3. http://intamap.geo.uu.nl/~paul
   4. http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770