[R-sig-ME] non-singularity with glmer() in a logit mixed model

Fernando Pedro Bruna Quintas |@brun@ @end|ng |rom udc@e@
Sun Feb 14 13:56:35 CET 2021

Hi. This is my first time in the list. I apologize if I make a mistake.

I am answering the comments of our reviewers after the submission of a paper. I mention this because the comments were positive, so the paper should be close to publication. However, one of the reviewers ask me to introduce two additional level-2 variables and the estimation becomes singular. I am a user of lme4 but my field is Economics, not Statistics.

The focus of the paper is to compare the results of two dependent variables, Y.A and Y.B for firms nested in 24 regions. I am estimating mixed logit models. The new level-2 variables suggested by the reviewer work fine for both dependent variables, with the expected signs and p-value close to zero (the journal uses p-values, so, please, ignore the debate about p-values here). However, the definition of Y.A is much more restrictive though more interesting than the definition of Y.B. Therefore, for Y.A there are far fewer ones for firms in the 24 regions. I know that 24 groups are not too many but, according to the literature, should be enough and they are enough for variable Y.B. However, the model of Y.A is non-singular when estimating with 6 lev-el-2 variables. 

Additionally, the reviewers ask me to write the tables of results for Y.A and Y.B with the same explanatory variables, even if some of them are non-significant. I might tell them that it would be better to avoid that. However, playing around to reduce the number of level-2 variables produce unclear results, as you can check in the following lines. I avoid the non-singular warning in the last column, but the deletion of variables is somewhat random. ICC goes from 0.12 when only level-1 variables are included, to the non-singular situation for 6 level-2 variables and to 0.01 after an arbitrary deletion of three level-2 variables.

Level 2 standardized group-centered predictors for the dependent variable Y.A
First column is the model only including level-1 variables
The rest of the columns are alternative models trying to avoid the non-singular warning									
		Odds	p	Odds 		p	Odds 		p		    Odds 		p			Odds 		p	Odds 	p	
X1					1.19		  0.034		1.32		<0.001		1.19		  0.035		1.37		<0.001	1.34		<0.001
X2					0.86		  0.003		0.86		  0.004		0.83		<0.001		0.84		<0.001		
X3					1.09		  0.134		1.14		  0.012						
X4					1.27		<0.001		1.28		<0.001		1.26		 <0.001	1.26		<0.001	1.27		<0.001
X5					3.79		<0.001		3.89		<0.001		3.79		 <0.001	3.92		<0.001	3.89		<0.001
X6					0.82		  0.090		0.77		  0.011				
Random Effects												
sigma2	3.29				3.29						3.29						3.29						3.29				3.29	
Tau00	0.44 region	0.00 region			0.00 region			0.00 region			0.00 region		0.03 region	
ICC	    0.12											0.01	
N			24 region		24 region				24 region				24 region				24 region		24 region	
Obs.		6122			6122					6122					6122					6122			6122	
Marg. R2	0.245 		0.490 					0.490 					0.490 					0.487 			0.488 
Cond. R2	0.333		NA						NA						NA						NA				0.492	
Devi.	4.797.692	3.895.819			3.900.349			3.899.897			3.906.505		3.915.146

As I mentioned before, data about variable Y.A may not contain enough information (variance) to discriminate between so many level-2 effects. However, now I should take decisions to finish the paper, summarize some useful technical explanations and derive possible economic explanations.

My questions are:
-	Is it possible to solve my non-singularity keeping all the explanatory variables? Below you can find the options I am using to estimate the model.
-	Should I ignore the non-singular warning? I understand that the cause of the problem is also that in a logit model adding more level-2 variables only reduces the level 2 variance, towards zero in my model. Indeed, one of the reviewers did not understand our explanation about the fixed lev-el-1 variance, so maybe it is better to avoid confusion about that.
-	In the same vein, should I avoid publishing ICC? I had thought that it was useful but some au-thors of papers with multilevel logit models do not publish ICC. 
-	If there is no solution, because of lack of information, do you have a suggestion or a reference with similar issues to provide some insights with this model in spite of the problem?

Thanks a lot,



> Mod.Y.A	<- glmer(update(eq2id, ~ . + (1 | region) ), family = binomial("logit"), data = inno, nAGQ = 9,  
+ control=glmerControl(optimizer="bobyqa", optCtrl=list(maxfun=100000), calc.derivs = FALSE))
> isSingular(Mod.Y.A)	
[1] TRUE

> require(optimx)
> library(dfoptim)			 
> Meq2id.allFit <- allFit(Mod.Y.A)
bobyqa : [OK]
Nelder_Mead : [OK]
nlminbwrap : [OK]
nmkbw : [OK]
optimx.L-BFGS-B : [OK]
nloptwrap.NLOPT_LN_BOBYQA : [OK]
> sapply(Meq2id.allFit,	logLik) 				## log-likelihoods
                       bobyqa                   Nelder_Mead 
                     -1947.91                      -1947.91 
                   nlminbwrap                         nmkbw 
                     -1947.91                      -1947.91 
              optimx.L-BFGS-B nloptwrap.NLOPT_LN_NELDERMEAD 
                     -1947.91                      -1947.91 
> sapply(Meq2id.allFit,	getME,"theta") 			## theta parameters
> !sapply(Meq2id.allFit,	inherits,"try-error") 	## was fit OK?
                       bobyqa                   Nelder_Mead 
                         TRUE                          TRUE 
                   nlminbwrap                         nmkbw 
                         TRUE                          TRUE 
              optimx.L-BFGS-B nloptwrap.NLOPT_LN_NELDERMEAD 
                         TRUE                          TRUE 

More information about the R-sig-mixed-models mailing list