[R-sig-ME] Specifying a (simple?) two level model
Reinhold Kliegl
reinhold.kliegl at gmail.com
Thu Jun 30 11:30:18 CEST 2011
Your first model is asking too much. With (1 + cluster | country) you
are estimating 22 variance components (the intercept plus 21 cluster
contrasts) and (22*21)/2 = 231 correlation parameters for them.
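
To make the count explicit, here is a small sketch in base R. The only input is the number of cluster levels (22, from your str() output); the variable names are mine.

```r
# Coefficients that vary by country in (1 + cluster | country):
# one intercept plus one contrast for each remaining cluster level.
n_cluster_levels <- 22
q <- 1 + (n_cluster_levels - 1)      # random-effect terms per country

n_variances    <- q                  # one variance per term
n_correlations <- q * (q - 1) / 2    # one correlation per pair of terms
n_total        <- n_variances + n_correlations

print(c(variances = n_variances,
        correlations = n_correlations,
        total = n_total))
```

That is 253 variance-covariance parameters to be estimated from only 22 countries, which is why the fit does not finish.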
Your second model is a good start for your question. Here is what the
model returns for you:
> ranef(my.fit)
These values are what Douglas Bates prefers to call "the conditional
modes of the random effects". To quote him: "If you want to be
precise, these are the conditional modes of the random effects B given
Y = y, evaluated at the parameter estimates."
Basically, they give you the positions of countries and clusters
relative to the intercept, taking into account the reliability (i.e.,
the number of observations) you have for each factor level. So adding
the terms yields a "prediction" on the basis of the two random factors.
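
As a rough illustration of "adding the terms", here is a sketch in base R. The intercept is the estimate from your summary; the conditional modes are invented numbers for illustration only, not values from your fit (in practice you would take them from ranef(my.fit)).

```r
# Fixed intercept from the quoted summary of the second model (logit scale)
intercept <- -2.0760

# HYPOTHETICAL conditional modes, made up for this illustration
country_mode <- c(sweden = -0.30, unitedkingdom = 0.25)
cluster_mode <- c("Unemployed - Unemployed" = 1.10)

# Linear predictor for every country x cluster pair: intercept plus both modes
eta <- intercept + outer(country_mode, cluster_mode, "+")

# Back-transform from the logit scale to a predicted poverty risk
risk <- plogis(eta)
print(round(risk, 3))
```

Because the two modes are simply added, the difference between two countries on the logit scale is the same in every cluster.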
This model does not give you country-specific cluster effects. One way
to model the interaction is to assume that there is a random effect for
each country and a separate random effect for each combination of
country and employment history (cluster). If the random effects for
these combinations are assumed to be independent with constant variance,
then the following model is appropriate:
m2 <- glmer(poverty.third.year ~ 1 + cluster + (1 | country) +
            (1 | country:cluster),
            family = binomial("logit"), data = poverty.risks)
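This model still generates 499 conditional modes (22 for countries plus
477 for the observed country-cluster combinations), but uses only 2
variance components plus the fixed effects for clusters (an intercept
and 21 contrasts); a binomial model has no separate residual variance
to estimate. This may be a good compromise or at least a starting point.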
It completed in about 20 minutes on my machine.
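
If you want to try the idea quickly before committing to the full data, here is a minimal sketch on a small simulated data set. It assumes lme4 is installed; the data, the variable names (d, poor, fit, grid) and the effect sizes are all invented for illustration, and only the model formula mirrors the one above.

```r
library(lme4)

set.seed(1)
n <- 3000
d <- data.frame(
  country = factor(sample(paste0("c", 1:6), n, replace = TRUE)),
  cluster = factor(sample(paste0("k", 1:4), n, replace = TRUE))
)

# Simulated (made-up) effects on the logit scale
combo     <- interaction(d$country, d$cluster, sep = ":", drop = TRUE)
u_country <- setNames(rnorm(nlevels(d$country), sd = 0.4), levels(d$country))
u_combo   <- setNames(rnorm(nlevels(combo), sd = 0.5), levels(combo))
beta      <- c(k1 = 0, k2 = 0.5, k3 = -0.5, k4 = 1)
eta <- -2 + beta[as.character(d$cluster)] +
  u_country[as.character(d$country)] + u_combo[as.character(combo)]
d$poor <- rbinom(n, 1, plogis(unname(eta)))

# Same structure as m2: fixed cluster effects, random intercepts for
# country and for each country-by-cluster combination
fit <- glmer(poor ~ 1 + cluster + (1 | country) + (1 | country:cluster),
             family = binomial("logit"), data = d)

# Country-specific predicted risk for each cluster, using all random effects
grid <- expand.grid(country = levels(d$country), cluster = levels(d$cluster))
grid$risk <- predict(fit, newdata = grid, type = "response",
                     allow.new.levels = TRUE)
risk_table <- xtabs(risk ~ country + cluster, data = grid)
print(round(risk_table, 3))
```

The rows of risk_table are the country-specific risks per cluster that the purely additive model cannot give you; allow.new.levels = TRUE is only there so that an unobserved country-cluster combination falls back to the country effect instead of raising an error.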
Reinhold Kliegl
On Thu, Jun 30, 2011 at 8:49 AM, Hans Ekbrand <hans at sociologi.cjb.net> wrote:
> Hi, this is my first post to the list. I am new to mixed models, but I
> think I have managed to specify my rather simple modelling problem
> correctly. The problem I have is that the computation never seems to
> finish (I waited for 10 hours before giving up).
>
> I am trying to model how risks of poverty vary with labour market
> position, while letting the effects of labour market position vary
> over countries.
>
> Here is a sample of the dataset, if you want to try it out
>
>> print(load(url("http://code.cjb.net/temp/pov.temp.RData")))
> [1] "poverty.risks"
>> str(poverty.risks)
> 'data.frame': 161348 obs. of 3 variables:
>  $ poverty.third.year: logi FALSE FALSE FALSE FALSE TRUE FALSE ...
>  $ country           : Factor w/ 22 levels "sweden","unitedkingdom",..: 1 1 1 1 1 1 1 1 1 1 ...
>  $ cluster           : Factor w/ 22 levels "Unemployed - Unemployed",..: 16 20 16 16 18 20 16 1 16 2 ...
>
> Labour market position is a factor that summarises a history of
> labour market positions over three years, where "Unemployed - Unemployed"
> means that the individual was unemployed at time0 and at time1.
>
> Here is my specification:
>
> my.fit <- glmer(poverty.third.year ~ cluster + (1 + cluster | country), family = binomial("logit"), data = poverty.risks)
>
> I saw, in Bates Chapter 2, that you could split the random terms into
> (1 | cluster) + (1 | country). Also, I am not sure whether or not to
> include cluster as a fixed term. If I split the random terms and skip
> cluster as a fixed term, then the computation takes only a few seconds.
>
> my.fit <- glmer(poverty.third.year ~ 1 + (1 | cluster) + (1 | country), family = binomial("logit"), data = poverty.risks)
>
> summary(my.fit)
> Generalized linear mixed model fit by the Laplace approximation
> Formula: poverty.third.year ~ 1 + (1 | cluster) + (1 | country)
> Data: poverty.risks
>    AIC    BIC logLik deviance
> 103922 103952 -51958   103916
> Random effects:
>  Groups  Name        Variance Std.Dev.
>  cluster (Intercept) 0.54046  0.73516
>  country (Intercept) 0.17247  0.41530
> Number of obs: 161348, groups: cluster, 22; country, 22
>
> Fixed effects:
>             Estimate Std. Error z value Pr(>|z|)
> (Intercept)  -2.0760     0.1807  -11.49   <2e-16 ***
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> My understanding, which I hope is wrong, is that this model does not
> compute country specific poverty risks for each cluster.
>
> If the first model is the right one for me, then for how long would
> it be reasonable to wait for the computation to terminate?
>
> --
> Hans Ekbrand <hans at sociologi.cjb.net>