[R-sig-ME] non-singularity with glmer() in a logit mixed model
Fernando Pedro Bruna Quintas
|@brun@ @end|ng |rom udc@e@
Sun Feb 14 13:56:35 CET 2021
Hi. This is my first time in the list. I apologize if I make a mistake.
I am answering the comments of our reviewers after the submission of a paper. I mention this because the comments were positive, so the paper should be close to publication. However, one of the reviewers ask me to introduce two additional level-2 variables and the estimation becomes singular. I am a user of lme4 but my field is Economics, not Statistics.
The focus of the paper is to compare the results of two dependent variables, Y.A and Y.B for firms nested in 24 regions. I am estimating mixed logit models. The new level-2 variables suggested by the reviewer work fine for both dependent variables, with the expected signs and p-value close to zero (the journal uses p-values, so, please, ignore the debate about p-values here). However, the definition of Y.A is much more restrictive though more interesting than the definition of Y.B. Therefore, for Y.A there are far fewer ones for firms in the 24 regions. I know that 24 groups are not too many but, according to the literature, should be enough and they are enough for variable Y.B. However, the model of Y.A is non-singular when estimating with 6 lev-el-2 variables.
Additionally, the reviewers ask me to write the tables of results for Y.A and Y.B with the same explanatory variables, even if some of them are non-significant. I might tell them that it would be better to avoid that. However, playing around to reduce the number of level-2 variables produce unclear results, as you can check in the following lines. I avoid the non-singular warning in the last column, but the deletion of variables is somewhat random. ICC goes from 0.12 when only level-1 variables are included, to the non-singular situation for 6 level-2 variables and to 0.01 after an arbitrary deletion of three level-2 variables.
Level 2 standardized group-centered predictors for the dependent variable Y.A
First column is the model only including level-1 variables
The rest of the columns are alternative models trying to avoid the non-singular warning
Odds p Odds p Odds p Odds p Odds p Odds p
X1 1.19 0.034 1.32 <0.001 1.19 0.035 1.37 <0.001 1.34 <0.001
X2 0.86 0.003 0.86 0.004 0.83 <0.001 0.84 <0.001
X3 1.09 0.134 1.14 0.012
X4 1.27 <0.001 1.28 <0.001 1.26 <0.001 1.26 <0.001 1.27 <0.001
X5 3.79 <0.001 3.89 <0.001 3.79 <0.001 3.92 <0.001 3.89 <0.001
X6 0.82 0.090 0.77 0.011
sigma2 3.29 3.29 3.29 3.29 3.29 3.29
Tau00 0.44 region 0.00 region 0.00 region 0.00 region 0.00 region 0.03 region
ICC 0.12 0.01
N 24 region 24 region 24 region 24 region 24 region 24 region
Obs. 6122 6122 6122 6122 6122 6122
Marg. R2 0.245 0.490 0.490 0.490 0.487 0.488
Cond. R2 0.333 NA NA NA NA 0.492
Devi. 4.797.692 3.895.819 3.900.349 3.899.897 3.906.505 3.915.146
As I mentioned before, data about variable Y.A may not contain enough information (variance) to discriminate between so many level-2 effects. However, now I should take decisions to finish the paper, summarize some useful technical explanations and derive possible economic explanations.
My questions are:
- Is it possible to solve my non-singularity keeping all the explanatory variables? Below you can find the options I am using to estimate the model.
- Should I ignore the non-singular warning? I understand that the cause of the problem is also that in a logit model adding more level-2 variables only reduces the level 2 variance, towards zero in my model. Indeed, one of the reviewers did not understand our explanation about the fixed lev-el-1 variance, so maybe it is better to avoid confusion about that.
- In the same vein, should I avoid publishing ICC? I had thought that it was useful but some au-thors of papers with multilevel logit models do not publish ICC.
- If there is no solution, because of lack of information, do you have a suggestion or a reference with similar issues to provide some insights with this model in spite of the problem?
Thanks a lot,
> Mod.Y.A <- glmer(update(eq2id, ~ . + (1 | region) ), family = binomial("logit"), data = inno, nAGQ = 9,
+ control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=100000), calc.derivs = FALSE))
> Meq2id.allFit <- allFit(Mod.Y.A)
bobyqa : [OK]
Nelder_Mead : [OK]
nlminbwrap : [OK]
nmkbw : [OK]
optimx.L-BFGS-B : [OK]
nloptwrap.NLOPT_LN_NELDERMEAD : [OK]
nloptwrap.NLOPT_LN_BOBYQA : [OK]
> sapply(Meq2id.allFit, logLik) ## log-likelihoods
> sapply(Meq2id.allFit, getME,"theta") ## theta parameters
> !sapply(Meq2id.allFit, inherits,"try-error") ## was fit OK?
More information about the R-sig-mixed-models