[R-sig-ME] Question about misspecified multilevel model

Sat Aug 6 07:50:33 CEST 2016

Hi Chris,

Since you have a concrete example why not just calculate the models and see what happens? The intercept-only model that you proposed is quite simple and fast to compute:

Random effects:
 Groups           Name        Variance Std.Dev.
 classid:schoolid (Intercept)   99.23   9.961  
 schoolid         (Intercept)   77.49   8.803  
 Residual                     1028.23  32.066  
Number of obs: 1190, groups:  classid:schoolid, 312; schoolid, 107

Now, if you do a model with just classroom, then you want this model:

class.mod <- lmer(mathgain ~ (1 | schoolid:classid), data = classroom)

(I'm assuming that classid isn't globally unique, but rather only unique within schools, otherwise the / and : operators add a bit of unnecessary overhead, but that's a different issue)

which has the following random effects: 

Random effects:
 Groups           Name        Variance Std.Dev.
 schoolid:classid (Intercept)  180     13.42   
 Residual                     1025     32.02   
Number of obs: 1190, groups:  schoolid:classid, 312

As you can see, the classid absorbed most of the variance. If you look at the fixed effects, then you'll see slight differences in the error estimates there as well -- models with incomplete random-effect structures typically have error estimates that are two small and are thus anti-conversative. This can be attributed to the variance-bias tradeoff or equivalently overfitting (see e.g. the new book Statistical Rethinking or any of the other classics mentioned here such as Pinheiro & Bates or Gelman & Hill). Roughly, leaving out random effects structure implies that the data are independent in in a way that they are not, which leads you to overestimate how much information you actually have and thus how little error.

Now, if you're curious you can also compare log likelihood if you refit the models via ML*:

> anova(class.mod,correct.mod,refit=TRUE)
refitting model(s) with ML (instead of REML)
Data: classroom
Models:
class.mod: mathgain ~ (1 | schoolid:classid)
correct.mod: mathgain ~ (1 | schoolid/classid)
            Df   AIC   BIC  logLik deviance  Chisq Chi Df Pr(>Chisq)   
class.mod    3 11785 11800 -5889.5    11779                            
correct.mod  4 11779 11800 -5885.7    11771 7.5906      1   0.005867 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

And you see that the model fits while tantalisingly close are nonetheless significantly different. 

Now, I'm not sure what will happen with more complex fixed-effects or random-effects structures, but my guess is that generally the variance will be allocated to whatever variable is "close enough" (here classroom is close enough to school because a collection of classrooms makes up a school) and if there are none close enough, the remaining variance will just contribute to the residual variance (which didn't happen here).

* The fixed-effects model matrix is the same for both models so that you could potentially compare REML-fitted models, but that also gets you into all sorts of fun debate about the pros and cons of REML and it's just easier to avoid all the issues with comparisons of REML-fit

Best,
Phillip

> On 6 Aug 2016, at 05:19, Christopher David Desjardins <cddesjardins at gmail.com> wrote:
> 
> Hi,
> 
> I have a question that's potentially off-topic but that I'm hoping that
> someone here can shed some insight on.
> 
> Assume that I know that I know my true model and that my true is a
> three-level model. My observations are such that I have a measurement on a
> student nested within a classroom nested within a school. The true model
> would be:
> 
> Y_ijk = pi_0jk + e_ijk  # student within classroom within schools (1st
> level)
> 
> pi_0jk = beta_j0k + r_p0k  # classroom within schools (2nd level)
> 
> beta_j0k = gamma_pq0 + u_pqk  # schools (3rd level)
> 
> The model in lmer would be:
> classroom <- read.csv("http://www-personal.umich.edu/~bwest/classroom.csv")
> library("lme4")
> correct.mod <- lmer(mathgain ~ (1 | schoolid/classid), data = classroom)
> 
> What I am wondering about is, if I were to omit that second level, the
> whole classroom within schools equation, where would that variance that
> would end up as the random intercept go? Would it go to the random
> intercept for school or would it go down to the residual?  The model I am
> referring to is below:
> 
> misspecified.mod <- lmer(mathgain ~ (1 | schoolid), data = classroom)
> summary(correct.mod); summary(misspecified.mod)
> 
> It looks like the variances for both the residual and the random intercept
> for school change. But maybe they do in a predictable way?
> 
> If someone could suggest a paper has an answer or better that would be very
> helpful.
> 
> Chris
> 
> 	[[alternative HTML version deleted]]
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models