[R-sig-ME] Why lme4 doesn't throw an error for illegitimate an model

Wed Oct 7 22:58:10 CEST 2020

Without knowing the structure of the data, it's hard to answer the
question ... there are ways for me explore and find out what the
nesting/grouping structure is, but that takes time I don't have. So
here's a quick attempt just using general principles.

I think the first trick is to stop thinking of thinking as an ith-level
predictor. For one thing, the levels don't have to be strictly nested.
Personally, I find the i-th level terminology confusing because I can
never remember which direction we're counting from.

Dropping the lm/ler calls for convenience in typing:

math ~ 1 + ses

this estimates a model with a slope for ses and an intercept.

math ~ 1 + ses + (1|sch.id)

this estimates a model with a slope for ses and an intercept and an
offset for each sch.id. (The estimated offsets are technically
"predictions", because the variance of the offsets is what's being
estimated, but we'll leave that be for now.)

An offset to what? Well the population-level/fixed effect corresponding
to the intercept. So you can think of this random effect as providing an
adjustment to the intercept for each sch.id

math ~ 1 + ses + (1+ses|sch.id)

this estimates a model with a slope for ses and an intercept; as well as
per-sch.id adjustments for the intercept and ses.

Note by the way that the model

math ~ 1

is a special case of

math ~ 1 + ses

with the slope of ses set / assumed to be zero.

This will help in the next step.

math ~ 1 + (1+ses|sch.id)

this estimates a model with a slope an intercept; as well as per-sch.id
adjustments for the intercept and ses. But where is ses in the fixed
effects? Well, you can think of it as being zero (see previous point),
so the adjustments will be from the assumed slope of zero, instead of
the estimated slope.

This brings us to

math ~ 1 + ses + (1 + sector | sch.id)

this estimates a model with a slope an intercept; as well as per-sch.id
adjustments for the intercept and sector.

This is mathematically well-defined, even if there is only one value of
sector observed for each sch.id (which is the case when sch.id is nested
within sector), because the shrinkage of the random effects deals with
the rank deficiency. (If you didn't understand that sentence: it's still
possible to estimate these quantities.) In the case that sector doesn't
vary within sch.id, you'll get an estimate that reflects this: either it
will be perfectly correlated with the intercept or shrunk to zero. In
other words, that's one way to get a singular/boundary fit.

Now mathematically well-defined doesn't mean that it makes sense
inferentially. And that's where it's incumbent upon the user to think
about their inferential question, their data, and their phrasing of the
inferential question as a statistical model.

Phillip

On 7/10/20 9:30 pm, Simon Harmel wrote:
> correction:
> 
> As far as I understand, in a 2-level model, the use of a level-2 predictor
> (here "sector") in the ***random part*** is illegitimate.
> 
> But I wonder why in the following model `lmer` doesn't throw an error to
> indicate that? What does the random part estimates in the output?
> 
> library(lme4)
> hsb <- read.csv('
> https://raw.githubusercontent.com/rnorouzian/e/master/hsb.csv')
> 
> mn <- lmer(math ~ ses + (sector | sch.id), data = hsb)
> 
> 
> 
> On Wed, Oct 7, 2020 at 2:24 PM Simon Harmel <sim.harmel using gmail.com> wrote:
> 
>> Dear All,
>>
>> As far as I understand, in a 2-level model, the use of a level-2 predictor
>> (here "sector") is illegitimate.
>>
>> But I wonder why in the following model `lmer` doesn't throw an error to
>> indicate that?
>>
>> library(lme4)
>> hsb <- read.csv('
>> https://raw.githubusercontent.com/rnorouzian/e/master/hsb.csv')
>>
>> mn <- lmer(math ~ ses + (sector | sch.id), data = hsb)
>>
> 
> 	[[alternative HTML version deleted]]
> 
> _______________________________________________
> R-sig-mixed-models using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>