[R-sig-ME] Model is nearly unidentifiable with glmer

Wed Feb 19 13:40:32 CET 2014

Dear All,

Summary:
========

I am running some models with glmer that are giving, among others, the
warning message

In checkConv(attr(opt, "derivs"), opt$par, checkCtrl = control$checkConv,  :
  Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?

but I cannot understand why the model is nearly unidentifiable, given both
the details of the experimental design and the comparison with other
approaches (which run without complaints and provide similar results).
Moreover, I have no idea what I am supposed to rescale.

Details:
========

The call to glmer
-----------------

gt.0 <- glmer(nS ~ Tree + Model + sh + S.Type + S.Time + S.Size + Method +
                  (1|dataSetID) + (1|obs),
              family = poisson,
              data = NP.all,
              control = glmerControl(
                  check.conv.singular = "warning",
                  optCtrl = list(maxfun = 10000)))

The model is for Poisson data, allowing for overdispersion (through the
observation-level random effect "obs") and with a "dataSetID" random
effect. (The glmerControl settings were added to try to understand what
might be happening.)
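
For readers unfamiliar with the observation-level random effect, here is a
minimal sketch of how such an "obs" factor is typically built: one level per
row. The toy data frame and the Method labels are placeholders, and I am
assuming the usual construction rather than quoting the code that created
NP.all.

```r
# Toy stand-in for NP.all: 3 data sets, each subject to 4 Methods.
# The Method labels "A".."D" are hypothetical.
toy <- data.frame(
  dataSetID = factor(rep(1:3, each = 4)),
  Method    = factor(rep(c("A", "B", "C", "D"), times = 3))
)

# Observation-level factor: one distinct level per row, so that
# (1|obs) can absorb extra-Poisson variation.
toy$obs <- factor(seq_len(nrow(toy)))

obs_levels <- nlevels(toy$obs)
obs_levels == nrow(toy)
```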

The design
----------

The data come from a simulation experiment. For each combination of
Tree * Model * sh * S.Type * S.Time * S.Size there are 20 replicate
simulations (each identified by a dataSetID), so this is a full factorial
design. Each variable is a factor (6 levels for Tree, 4 for Model,
2 for sh, 3 for S.Type, 2 for S.Time, 3 for S.Size).

Each replicate simulation, identified by a "dataSetID", is subject to four
Methods (a factor). That is why dataSetID is a random effect.

There are no missing values.

Thus, there are 17280 dataSetID groups (6*4*2*3*2*3*20), each with four
observations, resulting in a total of 69120 observations. glmer does
report these values just fine.
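
As a sanity check on those counts, here is a minimal sketch that rebuilds
the skeleton of the design with expand.grid. The integer level labels and
the "rep" column are placeholders, not the actual values in NP.all.

```r
# Skeleton of the full factorial design; level labels are hypothetical.
design <- expand.grid(
  Tree   = factor(seq_len(6)),
  Model  = factor(seq_len(4)),
  sh     = factor(seq_len(2)),
  S.Type = factor(seq_len(3)),
  S.Time = factor(seq_len(2)),
  S.Size = factor(seq_len(3)),
  rep    = seq_len(20)          # 20 replicate simulations per cell
)

n_datasets <- nrow(design)      # one row per dataSetID
n_obs      <- n_datasets * 4    # four Methods applied to each dataSetID

c(dataSetIDs = n_datasets, observations = n_obs)
```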

Therefore, based on that design, I think I should be able to fit not just
the model above, but a model with all possible interactions. In fact, that
model (a saturated model in the log-linear parlance, IIUC) is what I want
to start from.
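
The reasoning above can be checked on the design alone, without the
response. The sketch below builds one row per design cell (including
Method), forms the fixed-effects model matrix of gt.0 under R's default
treatment contrasts, and compares its rank with its number of columns. The
level labels are placeholders; only the numbers of levels come from the
design described above.

```r
# One row per cell of the full factorial, including Method (4 levels).
cells <- expand.grid(
  Tree   = factor(seq_len(6)),
  Model  = factor(seq_len(4)),
  sh     = factor(seq_len(2)),
  S.Type = factor(seq_len(3)),
  S.Time = factor(seq_len(2)),
  S.Size = factor(seq_len(3)),
  Method = factor(seq_len(4))
)

# Fixed-effects part of gt.0 (main effects only).
X <- model.matrix(~ Tree + Model + sh + S.Type + S.Time + S.Size + Method,
                  data = cells)
rank_X <- qr(X)$rank

# A model with all interactions has one coefficient per cell.
n_saturated <- nrow(cells)

c(columns = ncol(X), rank = rank_X, saturated_coefs = n_saturated)
```

A full-rank matrix here only says that the fixed effects are estimable from
the design; it does not rule out numerical problems during optimization.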

The warnings
------------

glmer gives two warnings:
1: In checkConv(attr(opt, "derivs"), opt$par, checkCtrl = control$checkConv,  :
  Model failed to converge with max|grad| = 392.339 (tol = 0.001)
2: In checkConv(attr(opt, "derivs"), opt$par, checkCtrl = control$checkConv,  :
  Model is nearly unidentifiable: very large eigenvalue
 - Rescale variables?

The failure to converge I don't like, but OK. However, the "nearly
unidentifiable" I just don't understand. And what variables am I supposed
to rescale?

Removing the random effect for obs or for dataSetID does not eliminate the
warnings (and, in fact, that should not be the problem, since both obs and
dataSetID get non-zero estimates).

Other methods
-------------

I have fitted the above model with:

- MCMCglmm (using the priors list(R = list(V = 1, nu = 0.002), G = list(G1
  = list(V = 1, nu = 0.002))))

- bglmer (with all as per default)

- glmer2stan (here using a more complex model with a bunch of interactions)

- glmmadmb (with family "nbinom")

and even if glmmadmb does complain about lack of convergence, there are no
further problems.

Moreover, with MCMCglmm I've fitted the model with interactions. Again, no
apparent problems (beyond slow mixing in some cases).

In addition, I have also fitted several other models that include all
possible interactions, just to make sure again I do not have something
silly in the model matrix. In some cases, the response is different (as
when fitting a lm/lmer). Beyond the lack of convergence of the lmer fit,
there are no other problems.

## a GLM with interactions but, of course, without the dataSetID random
## effect.

glm(nS ~ Tree * Model * sh *
        S.Type * S.Time * S.Size * Method,
    data = NP.all,
    family = poisson)

## a linear model with interactions

lm(Dissim ~ Tree * Model * sh *
       S.Type * S.Time * S.Size * Method,
   data = NP.all)

## an lmer fit

lmer(Dissim ~ Tree * Model * sh *
         S.Type * S.Time * S.Size * Method +
         (1|dataSetID),
     data = NP.all,
     control = lmerControl(
         check.conv.singular = "warning",
         optCtrl = list(maxfun = 10000)))
## This complains about lack of convergence, but not about
## identifiability problems.

(Lack of) Differences between fits
-----------------------------------

If anyone wants, I can of course provide the output from the fits.
In any case, the qualitative summary is that the estimates for the
fixed effects are very similar between glmer, admb, bglmer, and MCMCglmm,
and the estimates for the random effects are very similar for MCMCglmm,
bglmer, and glmer.

So, what am I missing?

Best,

R.

-- 
Ramon Diaz-Uriarte
Department of Biochemistry, Lab B-25
Facultad de Medicina 
Universidad Autónoma de Madrid 
Arzobispo Morcillo, 4
28029 Madrid
Spain

Phone: +34-91-497-2412

Email: rdiaz02 at gmail.com
       ramon.diaz at iib.uam.es

http://ligarto.org/rdiaz