[R-sig-ME] lme4, failure to converge with a range of optimisers, trust the fitted model anyway?
Hans Ekbrand
hans.ekbrand at gmail.com
Sun Apr 5 02:41:47 CEST 2015
On Sat, Apr 04, 2015 at 07:36:04PM -0400, Ben Bolker wrote:
> > So, perhaps I could aggregate the individual level data to the
> > cluster level, and do without the random term for cluster? I mean,
> > calculate the proportions of yes in each cluster and use that as
> > the dependent variable.
> >
> > This would, I assume, require that each cluster was given a weight
> > that corresponded to the number of individuals in it - or I would
> > not be able to say anything about probabilities at the individual
> > level, right?
>
> I would say pretty much any time you can aggregate without losing
> information, you should, for ease of computation (this accords with
> the Murtaugh 2007 "Simplicity and complexity" paper that I cite here
> on a regular basis). If all of your covariates are at the cluster
> level
[ Thanks for your advice, it's much appreciated. ]
The covariates are not all at the cluster level. Actually only one
covariate is at the cluster level, the others are at the country
level.
> then this reduces to a binomial GLM (you can account for
> overdispersion either by using a quasi-binomial model in glm() or by
> staying with glmer() and keeping the random effect of cluster (now an
> observation-level random effect).
Since I still have the country level to account for, I need to use glmer().
I have created an aggregated data set with a variable for the
proportion deprived=TRUE in each cluster, and a variable for the
number of cases in that cluster (to use as weight).
Perhaps this is not a question specific for mixed models, but what
family and link function would be appropriate now? This was the best I
could come up with:
library(car)
lm1 <- lmer(logit(Prop.deprived.in.cluster) ~ (1|Country) + QoG + GDPLog + Rural * KilledPerMillion5Log, data = my.small.df, weights = my.small.df$weight)
Linear mixed model fit by REML ['lmerMod']
Formula: logit(Prop.deprived.in.cluster) ~ (1 | Country) + QoG + GDPLog +
Rural * KilledPerMillion5Log
Data: my.small.df
Weights: my.small.df$weight
REML criterion at convergence: 157709.8
Scaled residuals:
Min 1Q Median 3Q Max
-4.9983 -0.5626 -0.0622 0.4371 6.1171
Random effects:
Groups Name Variance Std.Dev.
Country (Intercept) 1.768 1.33
Residual 139.292 11.80
Number of obs: 38028, groups: Country, 65
Fixed effects:
Estimate Std. Error t value
(Intercept) 1.25542 1.90567 0.66
QoG 0.39120 0.40850 0.96
GDPLog -0.51799 0.23031 -2.25
Ruralyes 1.91545 0.02232 85.80
KilledPerMillion5Log 0.06190 0.28557 0.22
Ruralyes:KilledPerMillion5Log 0.35303 0.03728 9.47
Correlation of Fixed Effects:
(Intr) QoG GDPLog Rurlys KlPM5L
QoG 0.571
GDPLog -0.989 -0.486
Ruralyes -0.013 0.001 0.006
KlldPrMll5L -0.013 0.239 -0.009 0.045
Rrlys:KPM5L 0.004 -0.001 0.000 -0.519 -0.084
my.small.df is available here if someone wants to have a go at it
http://hansekbrand.se/code/my.small.df.RData (1.1 MB)
More information about the R-sig-mixed-models
mailing list