[R-sig-ME] [FORGED] Working formula for glmm model in R (bernoulli response)

Fri Mar 20 22:39:29 CET 2020

On 21/03/20 2:50 am, Michael Romanov wrote:

> Thank you very much, Rolf!
> Your advise helped much!
> lme4 works well, although is produces some warninngs:
> Warning messages:
> 1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>    Model failed to converge with max|grad| = 0.527726 (tol = 0.001, 
> component 1)
> 2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
>    Model is nearly unidentifiable: very large eigenvalue
>   - Rescale variables?;Model is nearly unidentifiable: large eigenvalue 
> ratio
>   - Rescale variables?
> Nevertheless, it says that there’s a significant time trend in occupancy:
> Fixed effects:
>                Estimate Std. Error z value Pr(>|z|)
> (Intercept) 94.7762510  1.0176729   93.13   <2e-16 ***
> year        -0.0466590  0.0005135  -90.86   <2e-16 ***
> ======================================
> glmm also works well, though quite slow. What I thought to be an 
> infinite loop, turned to be very long loop. It is running about an hour, 
> then produces some output. It also warns about lack of convergence.
> In glmmTMB package the model doesn’t converge at all, so it is useless 
> for me since I’m interested more in p-values of time trend than in 
> slope. I do not see how it is possible to rescale since they are binary.

You should be re-scaling your numeric predictor ("year"), *not* the 
response!!!

> I’m still working on it. Probably there will be more questions if you 
> don’t mind.

I don't really mind, but I probably won't be able to help you very much. 
  I'm not an expert in this area; I'm a user rather than a developer. 
You'd be better off seeking help on r-sig-mixed-models, to which I have 
taken the liberty of CC-ing this email.  You should keep posts on this 
topic on-list.  There are people on the list who have far more insight, 
knowledge and expertise.  They would be much more likely to be able to 
help you.

You will probably have to provide your data --- or a sample thereof; or 
perhaps a simulated version thereof (if the data are confidential; IMHO 
data should *never* be confidential, but that's another story) --- if 
you really want to get serious help.

cheers,

Rolf

> Best regards,
> Michael
> 
>     Пятница, 20 марта 2020, 10:33 +11:00 от Rolf Turner
>     <r.turner using auckland.ac.nz>:
> 
>     On 20/03/20 2:14 am, Michael Romanov via R-sig-mixed-models wrote:
> 
>      > Dear all,
>      >
>      > Sorry for the silly question, but I got stuck on it.
> 
>     The only silly question is the one that is not asked.
>      >
>      > My dataset is occupancy of eagle territories in different years (see
>      > the example below). I’m trying to test my data for time trend, taking
>      > into account territory quality (as a random effect).
>      >
>      > territory year occupied
>      > CHV-1 2006 0
>      > CHV-12 2006 0
>      > CHV-120 2006 1
>      > CHV-13 2009 0
>      > CHV-14 2009 1
>      > CHV-15 2010 1
>      > CHV-16 2010 1
>      >
>      > My thoughts are following. ‘year’ is a fixed effect variable and
>      > ‘territory’ is a random effect variable. For example, in lme4 package
>      > the formula could be following: occupied = year + (1|territory).
>      > However, lme4 doesn’t support Bernoulli family,
> 
>     This is simply NOT TRUE!!! The Bernoulli family is just a special case
>     of the binomial family. See below.
> 
>     OTOH, the glmer() function from lme4 seems reluctant to handle the
>     simple model that you appear to have in mind. (Again, see below.)
> 
>     This, in my experience, is a recurring problem with lme4 --- it cannot
>     deal with the simple models that constitute edge cases.
> 
>     (It is also quite possible that I've got the syntax wrong. Perhaps a
>     younger and wiser head will chime in.)
> 
>      > so I chose glmm package. According to its specifications, the glmm
>      > function accepts two different formulae, separately for fixed and
>     random
>      > effects: >
>      > glmm(fixed, random, varcomps.names, data, family.glmm, m,
>      > varcomps.equal, doPQL = TRUE,debug=FALSE, p1=1/3,p2=1/3, p3=1/3,
>      > rmax=1000,iterlim=1000, par.init, zeta=5, cluster=NULL)
>      >
>      > So now I need two different formulae. I tried ‘occupied ~ year’,
>      > ‘occupied ~ territory’, but it doesn’t work. Namely, the RStudio gets
>      > into an infinite loop and doesn’t produce any output.
> 
>     Please DO NOT confuse RStudio with R. RStudio is simply an interface
>     that allows those mentally handicapped people who can only use GUIs to
>     access R. :-)
> 
>      >
>      > Can anyone help me to write a working formula (or formulae) to
>     properly
>      > run the glmm model?
> 
>     I have no experience with glmm and don't really understand what you are
>     trying to model. Moreover you did not supply a reproducible example, as
>     you are instructed to do by the Posting Guide. This makes it very
>     difficult to help you. However I persisted, and constructed a simulated
>     toy example data set, and fitted a model to it:
> 
>     library(glmm)
>     set.seed(42)
>     X <- data.frame(territory=factor(paste0("CHV-",
>                                        sample(6:10,200,TRUE))),
>                       year=sample(2001:2020,200,TRUE),
>                       occupied=sample(0:1,200,TRUE))
>     fit1 <- glmm(fixed=occupied~year,random=occupied~territory,
>                   family=bernoulli.glmm,varcomps.names="territory",
>                   data=X,m=1e3)
> 
>     No infinite loops seem to have arisen. Note that coef(fit1) gives:
> 
>      > (Intercept) year
>      > 23.17057405 -0.01164075
> 
>     The "variance" component estimate is:
>      > Estimate ...
>      > territory 1.533e-05 ...
> 
>     Now back to lme4:
> 
>     X$unoccupied <- 1-X$occupied
>     library(lme4)
>     fit2 <- glmer(cbind(occupied,unoccupied) ~ year + (1|territory),
>                     family=binomial(),data=X)
> 
>     This seems not to work properly; it says:
> 
>      > boundary (singular) fit: see ?isSingular
> 
>     However the fixed effects in fit2 are effectively the same as those
>     in fit1:
>      > Fixed Effects:
>      > (Intercept) year
>      > 23.18216 -0.01164
> 
>     The variance component is said to be 0.
> 
>     I then tried the glmmTMB() package:
> 
>     library(glmmTMB)
>     fit3 <- glmmTMB(cbind(occupied,unoccupied) ~ year + (1|territory),
>                     family=binomial(),data=X)
> 
>     This gave a mild squawk about a "Model convergence problem" but produced
>     an answer with essentially the same fixed effect estimates as the other
>     two methods:
> 
>      > Estimate ...
>      > (Intercept) 23.18216 ...
>      > year -0.01164
> 
>     The variance component estimate was vanishingly small: 1.062e-91.
> 
>     Bottom line: I think I'd go with the glmmTMB package for your problem,
>     but glmm seems to work, if you want to stick with that.
> 
>     HTH