[R-sig-ME] Meaning of perfect correlation between by-intercept and by-slope adjustments

Sat May 14 14:48:51 CEST 2011

On 11-05-16 02:09 PM, Petar Milin wrote:
> On Fri, May 13, 2011 at 10:51 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
>> On Fri, May 13, 2011 at 3:32 PM, Petar Milin <pmilin at ff.uns.ac.rs> wrote:
>>>
>>> On 13/05/11 22:00, Douglas Bates wrote:
>>>>
>>>> On Fri, May 13, 2011 at 12:35 PM, Petar Milin <pmilin at ff.uns.ac.rs> wrote:
>>>>>
>>>>> Hello! The simplified model that I have is:
>>>>> lmer(Y ~ F1 + F2 + C1 + (1+F1|participants) + (1|items))
>>>>> F1 and F2 are categorical predictors (factors) and C1 is a covariate
>>>>> (continuous predictor). F1 has five levels.
>>>>> By-participant adjustments for F1 are justified (the likelihood ratio
>>>>> test is highly significant). However, what puzzles me is a perfect
>>>>> correlation between the by-participant adjustments for two levels of
>>>>> F1. The other correlations are quite high, but not perfect. I wonder
>>>>> what this means, exactly? Is there some "lack of information" which
>>>>> leads to problems in estimating the variances?
>>>>
>>>> I think of the estimation criterion for mixed models (the REML
>>>> criterion or the deviance) as being like a smoothing criterion that
>>>> seeks to balance complexity of the model versus fidelity to the data.
>>>> It happens that models in which the variance covariance matrix of the
>>>> random effects is singular or nearly singular are considered to have
>>>> low complexity so the criterion will push the optimization to that
>>>> extreme when doing so does not introduce substantially worse fits.
>>>>
>>>> One way around this is to avoid fitting models with vector-valued
>>>> random effects and, instead, use two terms with simple scalar random
>>>> effects, as in
>>>>
>>>> lmer(Y ~ F1 + F2 + C1 + (1|participants) + (1|F1:participants) + (1|items))
>>>
>>> I am always hesitant to go for the scalar version. As far as I understand,
>>> it implies homoscedasticity across the levels of F1, but correct me if I am
>>> wrong. In my model, I am not sure that assumption would be correct.
>>
>> You are correct.  However, the model with vector-valued random effects
>> is not supported by the data in the sense that it converges to a
>> singular variance-covariance matrix.  When you have 5 random effects
>> associated with each level of participant and you allow a general 5 by 5
>> positive semi-definite variance-covariance matrix you are attempting
>> to estimate 15 variance-covariance parameters (5 variances and 10
>> covariances) for that one term.  You need a lot of data to be able to
>> do that.
>>
> 
> I am reading up on this, trying to understand and cope with it
> properly. Bottom line: would using vector-valued random effects in the
> above case -- with a perfect correlation between random adjustments --
> be a case of overfitting?
> 

  I think so.
  If you wanted a justification for dropping back to the homoscedastic
model, you could compare the likelihoods of the heteroscedastic and
homoscedastic model fits, which you can probably establish are a pair of
nested models (and whose likelihoods may actually be identical, in which
case the simpler model loses nothing in fit).
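
  A minimal sketch of that comparison, assuming the lme4 package is
installed. The data are simulated here purely for illustration (the data
frame d, the effect sizes, and the object names m_vec and m_sca are
hypothetical, not the original poster's); only the two model formulas come
from the thread. Both models are fitted by maximum likelihood so that
their likelihoods are directly comparable.

```r
library(lme4)

set.seed(1)
# Hypothetical balanced design: 20 participants x 10 items x 5 levels of F1
d <- expand.grid(participants = factor(1:20),
                 items        = factor(1:10),
                 F1           = factor(paste0("L", 1:5)))
d$F2 <- factor(sample(c("a", "b"), nrow(d), replace = TRUE))
d$C1 <- rnorm(nrow(d))

# Simulate Y with random intercepts only, so the 5 x 5 covariance matrix
# of the vector-valued term is not supported by the data
b_p <- rnorm(20, sd = 1)    # by-participant intercepts
b_i <- rnorm(10, sd = 0.5)  # by-item intercepts
d$Y <- 0.5 * as.integer(d$F1) + 0.3 * d$C1 +
  b_p[as.integer(d$participants)] + b_i[as.integer(d$items)] +
  rnorm(nrow(d))

# Heteroscedastic model: vector-valued random effects for F1
m_vec <- lmer(Y ~ F1 + F2 + C1 + (1 + F1 | participants) + (1 | items),
              data = d, REML = FALSE)

# Homoscedastic model: two scalar random-effects terms
m_sca <- lmer(Y ~ F1 + F2 + C1 + (1 | participants) +
                (1 | F1:participants) + (1 | items),
              data = d, REML = FALSE)

VarCorr(m_vec)       # look for correlations at or near +/-1
isSingular(m_vec)    # TRUE if the estimated covariance matrix is singular
anova(m_sca, m_vec)  # likelihood ratio comparison of the two fits
```

  The vector-valued fit may emit convergence or singularity warnings,
which is itself informative. The two fits differ by 13 covariance
parameters (15 versus 2), and because the simpler model sits on the
boundary of the parameter space the chi-squared p-value from anova()
should be read as conservative.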

> Thanks!
> Petar